THOMPSON RIVERS UNIVERSITY

Robust Regression Models with Missing and Censored Data via the EM Algorithm

By Minoli Randika Munasinghe

A THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF Master of Science in Data Science

KAMLOOPS, BRITISH COLUMBIA
September, 2024

SUPERVISORS
Mateen Shaikh & Erfanul Hoque

© Minoli Randika Munasinghe, 2024

ABSTRACT

This study addresses the challenge of fitting a simple linear regression model to data with missing or censored values by presenting a novel approach based on the Expectation–Maximization algorithm. The proposed approach is developed to estimate the parameters of models under various incomplete data scenarios, including missing, left censored, right censored, and interval censored observations in bivariate normal data. Extensive simulation studies evaluate the performance of the proposed approach across varying sample sizes, incomplete proportions, and correlations between variables. The study compares the proposed approach with the existing naive approach, which ignores incomplete observations in model fitting. Evaluation metrics including bias, variance, Root Mean Squared Error, Coverage Probability, and Relative Root Mean Squared Error are used to assess the estimates from both approaches. The proposed method is applied to real datasets to validate its effectiveness against naive estimates. Standard errors, confidence intervals, and interval widths are used to assess the precision and accuracy of parameter estimates for both models, while the Akaike Information Criterion is used to select the best–fitting model. The results show that the proposed approach provides more accurate and precise parameter estimates than the naive approach in both simulation studies and real data applications.

Keywords: Bivariate normal distribution, Expectation–Maximization Algorithm, Interval Censoring, Left Censoring, Missing Data, Regression, Right Censoring

ACKNOWLEDGEMENTS

I am incredibly grateful to my supervisors, Dr.
Mateen Shaikh and Dr. Erfanul Hoque from the Department of Mathematics and Statistics at Thompson Rivers University, for their continuous guidance throughout my academic journey. Their expertise, encouragement, and valuable feedback have been crucial in shaping the direction of this thesis. I also thank Dr. Robin Kleiv, my committee member, for his contributions. I acknowledge the guidance of Dr. Mohammed Tawhid, the MSc Data Science program coordinator, for directing me through the program requirements. I am also thankful to the faculty members of the Department of Mathematics and Statistics, specifically Dr. Jabed Tomal, Dr. Mila Kwiatkowska, and Dr. Roger Yu, former program coordinator, for their contributions to my education in various ways. I am grateful for the financial support provided by Dr. Mateen Shaikh and Dr. Erfanul Hoque for this research project, which was funded by the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant and the Research Accelerate Grant from Thompson Rivers University.

Special appreciation goes to my beloved husband, Muditha Ekanayake, whose selfless encouragement and endless support have inspired me throughout my academic journey. I am also deeply thankful to my parents and brothers for their support in every situation. I would like to extend my thanks to my friends and peers for their encouragement and moral support; their companionship and shared experiences have made this journey more enjoyable. Lastly, I express my deepest gratitude to the Sri Lankan free education system for laying a strong foundation for my academic studies.

Contents

1 Introduction
  1.1 Problem Statement
  1.2 Research Steps
  1.3 Significance of the Study
  1.4 Organization of the Thesis
2 Literature Review
  2.1 Missing Data Patterns
  2.2 Censoring Data Patterns
  2.3 Introduction to Regression
    2.3.1 Simple Linear Regression (SLR)
    2.3.2 Least Squares (LS) Method
    2.3.3 Maximum Likelihood (ML) Method
  2.4 Techniques for Handling Missing Data
  2.5 Expectation–Maximization (EM) Algorithm
    2.5.1 Applications of the EM Algorithm in the Presence of Incomplete Data
  2.6 Univariate Normal Distribution
    2.6.1 Right Censored Univariate Normal Data
    2.6.2 Left Censored Univariate Normal Data
    2.6.3 Interval Censored Univariate Normal Data
  2.7 Bivariate Normal (BVN) Distribution
3 Methodology
  3.1 EM Algorithm for the BVN Distribution
  3.2 EM Algorithm for Missing Observations
    3.2.1 E–Step
    3.2.2 M–Step
  3.3 Performance Metrics to Evaluate the Regression Estimates
4 Simulation Studies
  4.1 Simulation Design
  4.2 Simulation Results
    4.2.1 Results in the Presence of Missing Observations
    4.2.2 Results in the Presence of Right Censored Observations
    4.2.3 Results in the Presence of Left Censored Observations
    4.2.4 Results in the Presence of Interval Censored Observations
    4.2.5 Summary of the Simulation Results
5 Data Applications
  5.1 Determining the Effect of Solar Radiation on Ozone Concentration–An Application to the Presence of Missing Observations
  5.2 Exploring the Relationship Between Blood and Feather Lead Concentrations in Black–Crowned Night Herons (Nycticorax nycticorax)–An Application to the Presence of Left Censored Observations
  5.3 Summary of the Data Applications
6 Discussion
  6.1 Limitations of the Study
  6.2 Future Work
A EM Algorithm for Different Censoring Types
B Additional Simulation Results

List of Figures

2.1 Scatter plots of BVN data in x and y variables, depicting different types of censored observations. Vibrant blue–green points denote censored data, including left censored, right censored, and interval censored observations. Black points represent uncensored data points. The red vertical line indicates the threshold in the x variable, while the green horizontal line denotes the threshold in the y variable.
2.2 Traditional and modern missing data handling methods [Shylaja and Kumar, 2018]
4.1 Boxplots for estimates of intercept ($\hat{\beta}_0$) and slope ($\hat{\beta}_1$), n = 100 and ρ = 0.2 over different missing proportions of 0.1, 0.3, and 0.5
4.2 Line charts for asymptotic variance of intercept ($\hat{\beta}_0$) and slope ($\hat{\beta}_1$) when ρ = 0.2 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500
4.3 Boxplots for estimates of intercept ($\hat{\beta}_0$) and slope ($\hat{\beta}_1$), when n = 100 and ρ = 0.2 over different right censoring proportions of 0.1, 0.3, and 0.5
4.4 Line charts for asymptotic variance of intercept ($\hat{\beta}_0$) and slope ($\hat{\beta}_1$) when ρ = 0.2 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500
4.5 Boxplots for estimates of intercept ($\hat{\beta}_0$) and slope ($\hat{\beta}_1$), n = 100 and ρ = 0.2 over different left censoring proportions of 0.1, 0.3, and 0.5
4.6 Line charts for asymptotic variance of intercept ($\hat{\beta}_0$) and slope ($\hat{\beta}_1$) when ρ = 0.2 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500
4.7 Boxplots for estimates of intercept ($\hat{\beta}_0$) and slope ($\hat{\beta}_1$), when n = 100 and ρ = 0.2 over different interval censoring proportions of 0.1, 0.3, and 0.5
4.8 Line charts for asymptotic variance of intercept ($\hat{\beta}_0$) and slope ($\hat{\beta}_1$) when ρ = 0.2 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500
5.1 Scatter plot of Ozone concentration (ppb) over Solar radiation (Lang)
5.2 Histogram and normal Q–Q plot for solar radiation
5.3 Histogram and normal Q–Q plot for Ozone concentration
5.4 Histogram and normal Q–Q plot for log–transformed Ozone concentration
5.5 Scatter plot of lead concentration in blood and feather of Herons

List of Tables

3.1 Data structure for variables x and y
3.2 Scenarios with observed and missing data for x and y
4.1 Simulation details for different sample sizes and correlations for a single proportion of x and y.
4.2 Bias, asymptotic RMSE, 95% CP, and relative RMSE of intercept ($\hat{\beta}_0$) for ρ = 0.2 over sample sizes 50, 100, and 500 and missing proportions 0.1, 0.3 and 0.5
4.3 Bias, asymptotic RMSE, 95% CP, and relative RMSE of slope ($\hat{\beta}_1$) for ρ = 0.2 over sample sizes 50, 100, and 500 and missing proportions 0.1, 0.3 and 0.5
4.4 Bias, asymptotic RMSE, 95% CP, and relative RMSE of intercept ($\hat{\beta}_0$) for ρ = 0.2 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5
4.5 Bias, asymptotic RMSE, 95% CP, and relative RMSE of slope ($\hat{\beta}_1$) for ρ = 0.2 over sample sizes 50, 100, and 500 and right censoring proportions 0.1, 0.3, and 0.5
4.6 Bias, asymptotic RMSE, 95% CP, and relative RMSE of intercept ($\hat{\beta}_0$) for ρ = 0.2 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5
4.7 Bias, asymptotic RMSE, 95% CP, and relative RMSE of slope ($\hat{\beta}_1$) for ρ = 0.2 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5
4.8 Bias, asymptotic RMSE, 95% CP, and relative RMSE of intercept ($\hat{\beta}_0$) for ρ = 0.2 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5
4.9 Bias, asymptotic RMSE, 95% CP, and relative RMSE of slope ($\hat{\beta}_1$) for ρ = 0.2 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5
5.1 Comparison of model coefficients, standard errors, Lower Confidence Limit (LCL) and Upper Confidence Limit (UCL), and interval width of naive model and proposed model
5.2 Comparison of model coefficients, standard errors, Lower Confidence Limit (LCL) and Upper Confidence Limit (UCL), and interval width of naive model, alternate model, and proposed model
A.1 Scenarios with observed and right censored data for x and y
A.2 Scenarios with observed and left censored data for x and y
A.3 Scenarios with observed and interval censored data for x and y

Abbreviations

BVN    Bivariate Normal
CDF    Cumulative Distribution Function
CI     Confidence Interval
CLS    Complete Least Squares
CP     Coverage Probability
EM     Expectation–Maximization
FIM    Fisher Information Matrix
LD     Listwise Deletion
LS     Least Squares
MCAR   Missing Completely At Random
ML     Maximum Likelihood
MLE    Maximum Likelihood Estimate
MSE    Mean Squared Error
PDF    Probability Density Function
RMSE   Root Mean Squared Error
RRMSE  Relative Root Mean Squared Error
RSS    Residual Sum of Squares
SLR    Simple Linear Regression

Chapter 1

Introduction

The quality and comprehensiveness of datasets play a vital role in data–driven research to ensure reliable and accurate analytical outcomes. However, obtaining complete and perfectly recorded data for analysis is nearly impossible in practice. Many datasets remain incomplete because significant information about the observations of interest is lacking. This incompleteness complicates the analysis and needs to be addressed effectively through an appropriate approach.

Missing and censored data are two major types of incomplete data. Missing data occurs due to multiple factors, including non–response by participants, technical limitations, errors in data collection, and dropouts from studies. For example, data collected from a healthcare study over a period of time may have missing observations due to patient dropouts or missed follow–up appointments.
Censoring arises in numerous applications for reasons such as limitations in measuring instruments or experimental design. For example, air pollutant measurements can be censored due to the detection limits of the measuring instruments.

In both cases, incomplete data can pose significant challenges, resulting in statistical inconvenience, increased problem complexity, biased estimates, and inaccurate analysis and decision–making. The impact can be serious depending on the pattern, percentage, and mechanism of incompleteness. In some cases, the missing data can be random, causing less impact on the results. In other cases, the data can be missing in a way that introduces significant bias, creating misleading results. Moreover, censored data often produces higher bias and variance than missing data. Therefore, understanding and identifying the type, origin, and implications of incomplete data is crucial for researchers, as it plays a vital role in developing robust methodologies to mitigate its adverse effects.

This study addresses the problem of fitting Simple Linear Regression (SLR) models to Bivariate Normal (BVN) data with missing or censored observations. It is motivated by the audiometric study conducted by De Lima Taga and Singer [2016], which explored the potential to predict behavioural thresholds based on electro–physiological thresholds. They used the Newton–Raphson optimization algorithm to obtain Maximum Likelihood (ML) estimates for an SLR model in which both the dependent and independent variables were subject to interval censoring. In addition, several studies proposed new methodologies for regression modeling while handling missing and censored data in different distributions. For instance, Miller [1976] estimated parameters of an SLR model by utilizing the Kaplan–Meier estimator for right censored response data. Guzmán et al.
[2020] used the Monte Carlo Expectation–Maximization (MCEM) algorithm to develop linear regression models based on skew scale mixtures of normal distributions where the response variable is left censored. Austin and Hoch [2004] estimated the regression coefficients for censored covariates using ordinary LS, partial LS, and full likelihood methods. In addition, Buckley and James [1979], Koul et al. [1981], and Ritov [1990] proposed various other procedures to obtain regression models for right censored data. However, to our knowledge, previous research has not incorporated the EM algorithm for fitting an SLR model when the data contain missing or censored observations.

Several researchers also explored the applications of the EM algorithm across diverse data types, incomplete data scenarios, and distributions. Atkinson [1992] proposed a hybrid EM algorithm to estimate the parameters of a normal mixture model in the presence of right censoring. McLachlan and Jones [1988] applied EM algorithms to fit mixture models to truncated and grouped data, Lee and Scott [2012] developed multivariate Gaussian mixture models capable of handling both truncated and censored data, and Josse et al. [2018] explored the EM approach for estimating the parameters of a BVN distribution, specifically focusing on the scenario where data is missing for one of the two variables. To the best of our knowledge, previous research has not specifically addressed the challenge of fitting SLR models for missing or censored BVN data. Moreover, our study emphasizes joint modeling for observations where both variables are censored. Therefore, this study aims to design an EM–based approach that effectively integrates these complexities for obtaining regression parameter estimates.
1.1 Problem Statement

Traditional statistical methods frequently encounter challenges in effectively addressing various types of incomplete data, including missing observations, left censoring, right censoring, and interval censoring in regression modeling. While the EM algorithm has demonstrated flexibility in managing incomplete data across various distributions, existing research primarily focuses on obtaining only MLEs when incompleteness affects a single variable. This study aims to address the gap in the literature by developing and validating an EM–based approach designed for regression modeling with missing or censored BVN continuous data, where both variables may experience missing data or censoring in a unified framework. To the best of our knowledge, this specific problem has not been adequately addressed in a common framework, and it is crucial for fields requiring precise estimation of regression parameters, such as epidemiology, finance, and environmental sciences. Therefore, this study will focus on enhancing the accuracy and reliability of regression parameter estimates for BVN data. The proposed method will integrate the complexities of missing and censored observations in both variables. This research intends to provide robust solutions that improve existing methods and contribute significantly to advancing statistical modeling in complex data scenarios.

1.2 Research Steps

1. Develop an EM–based approach to accurately estimate the regression parameters of an SLR model in various incomplete data scenarios. These scenarios include missing, left censored, right censored, and interval censored observations within both variables (the independent variable x and the dependent variable y) of BVN datasets.
2. Assess the performance of the parameter estimation scheme using the estimates obtained from the proposed method under varying data scenarios, such as different sample sizes, proportions of missing and censored observations, and correlations between variables.
3.
Compare the regression estimates derived from the proposed approach, which handles missing and censored data, with those from naive models that only consider fully observed data for fitting regression models.
4. Apply the proposed method to real datasets.

1.3 Significance of the Study

This study extends previous research by addressing a variety of incomplete data scenarios in regression models related to the BVN distribution. It examines left censored data, where observations are below a minimum threshold; right censored data, where observations exceed a maximum threshold; interval censored data, where observations lie within a specified range; and missing data in varying combinations within a BVN distribution. Simulation studies are conducted to obtain parameter estimates of SLR models under varying data scenarios, such as different proportions of incompleteness, sample sizes, and correlations. The regression estimates obtained from the proposed approach are evaluated across these scenarios to determine its effectiveness in each scenario. The naive models, obtained by fitting regression models on fully observed data only, serve as the baseline against which the proposed models are compared. Real data applications are incorporated to assess the applicability of the approach, demonstrating its potential to solve real–world problems. Therefore, this research contributes valuable insights for researchers dealing with regression modeling with missing or censored data arising from limitations in measurement equipment, where the measures are linearly dependent on each other.

1.4 Organization of the Thesis

The remainder of the thesis is organized into the following chapters. Chapter 2 briefly explains regression analysis and applications of the EM algorithm in statistical modeling. It also covers essential concepts related to univariate and BVN distributions and their properties.
Chapter 3 details the research methodology, focusing on adapting the EM algorithm for fitting SLR models to BVN data with missing or censored values. It further elaborates on the performance metrics used to evaluate the regression parameters. Chapter 4 presents the design and outcomes of comprehensive simulation studies. Chapter 5 shifts to real–world applications, applying the proposed method to actual datasets to validate its practical utility and effectiveness. Finally, Chapter 6 summarizes each chapter, discusses the limitations of the proposed method, and proposes directions for future work.

Chapter 2

Literature Review

This chapter discusses the most common missing data patterns, censoring data patterns, an overview of regression, existing techniques for handling incomplete data, a brief explanation of the EM algorithm, existing literature on its applications, the univariate normal distribution, and the bivariate normal distribution.

2.1 Missing Data Patterns

It is crucial to understand the nature and forms of missing data to effectively address the complexities of incomplete data. Rubin [1976] classified missing data into three major categories based on the nature of the missingness.

Missing Completely At Random (MCAR) refers to observations that are missing completely at random, without any influence of the observed or missing values. The missingness lacks a systematic pattern and occurs randomly. Missing data is easier to handle when the MCAR assumption is satisfied, because the complete randomness allows results to be obtained without introducing significant bias into the analysis. For example, suppose a phone randomly fails and restarts while recording videos, so that some videos are recorded successfully while others are lost entirely. This is considered MCAR, as which videos are lost does not depend on the content of the video itself or on any other factor related to the recording process.
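To make the MCAR mechanism concrete, the following minimal Python sketch deletes each value independently with a fixed probability and compares the mean of the remaining values with the mean of the full sample. The function name `make_mcar`, the sample size, the normal parameters, and the 30% missing proportion are illustrative assumptions and are not taken from the simulation design of this thesis.

```python
import random
import statistics


def make_mcar(values, prop_missing, seed=0):
    """Return a copy of `values` where each entry is replaced by None
    independently with probability `prop_missing` (hypothetical helper).
    The deletion ignores the value itself, which is what makes it MCAR."""
    rng = random.Random(seed)
    return [None if rng.random() < prop_missing else v for v in values]


# Illustrative data: 5000 draws from N(10, 2^2); these settings are assumptions.
data_rng = random.Random(42)
full = [data_rng.gauss(10.0, 2.0) for _ in range(5000)]

incomplete = make_mcar(full, prop_missing=0.3, seed=1)
observed = [v for v in incomplete if v is not None]

full_mean = statistics.fmean(full)
observed_mean = statistics.fmean(observed)
missing_rate = 1 - len(observed) / len(full)

print(f"missing rate:       {missing_rate:.3f}")
print(f"mean (full sample): {full_mean:.3f}")
print(f"mean (observed):    {observed_mean:.3f}")
```

Because the deletion does not depend on the data, the observed values form a random subsample, so the two means should differ only by sampling noise. This is the sense in which analyses under MCAR avoid systematic bias, although they still lose precision.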
Missing At Random (MAR) means that the probability of missingness is related to the observed data, but not directly to the missing information itself. MAR missingness follows a pattern tied to the observed variables and is also called ignorable missingness. For example, when capturing videos on a phone with a battery level of 1%, the recording often fails, resulting in missing video data. The failure of the video recording is more likely to occur when the battery is low (an observed factor), and the failure (missing data) itself does not influence the battery level.

Missing Not At Random (MNAR) is the most challenging form of missing data. Here the probability of missingness is related to the missing values themselves and cannot be explained by the observed data alone. It is also called non–ignorable missingness. MNAR observations occur due to unobserved variables. As an example, suppose the phone camera fails to record videos more often when the battery is low, but does not keep track of the exact battery level that triggers the issue. The missing video data is then MNAR, as the missing information (the critical battery threshold) directly influences the likelihood of missing videos.

2.2 Censoring Data Patterns

A censored datum refers to a value known to lie within a specific interval. Censoring occurs most frequently in survival analysis, clinical trials, environmental studies, and social sciences. In classical inference, the limitations in measuring instruments and specific reporting protocols are two major causes of censoring. Generally, measuring instruments have minimum and maximum measurement thresholds beyond which they cannot provide accurate measurements. This can cause censoring, because only the threshold is reported instead of an exact measurement. Censoring has three major types, and understanding them is crucial in order to provide meaningful and accurate results.

Suppose an observation falls below the minimum measurement threshold of an instrument.
In that case, the minimum threshold is recorded as the measured value and the observation is considered Left Censored (LC). Let x and y be random variables of a BVN distribution. The observations of x and y take the form

\[
x_i^{o} =
\begin{cases}
x_i & \text{if } x_i \ge c_x^{lc} \\
c_x^{lc} & \text{if } x_i < c_x^{lc}
\end{cases}
\qquad
y_i^{o} =
\begin{cases}
y_i & \text{if } y_i \ge c_y^{lc} \\
c_y^{lc} & \text{if } y_i < c_y^{lc}
\end{cases}
\]

where $i = 1, 2, \ldots, n$ indexes the observations, the superscript $o$ denotes observed data, and $c_x^{lc}$ and $c_y^{lc}$ are the left censoring thresholds in x and y, respectively. The left censored observations are shown in the first scatter plot of Figure 2.1. The red vertical line and the green horizontal line represent the censoring thresholds for x and y, respectively, both set at −2. Observations falling below these thresholds (indicated by vibrant blue–green points) are considered left censored, meaning their exact values are unknown and they are recorded at the threshold. For example, any observation below −2 for x or y is recorded as −2.

Suppose instead that an observation falls above the maximum measurement threshold of an instrument. It is then considered a Right Censored (RC) observation, and the maximum threshold is recorded as the measured value, as described below.

\[
x_i^{o} =
\begin{cases}
x_i & \text{if } x_i \le c_x^{rc} \\
c_x^{rc} & \text{if } x_i > c_x^{rc}
\end{cases}
\qquad
y_i^{o} =
\begin{cases}
y_i & \text{if } y_i \le c_y^{rc} \\
c_y^{rc} & \text{if } y_i > c_y^{rc}
\end{cases}
\]

where $i = 1, 2, \ldots, n$ indexes the observations, and $c_x^{rc}$ and $c_y^{rc}$ are the right censoring thresholds in x and y, respectively. The right censored observations are shown in the second scatter plot of Figure 2.1. The censoring thresholds for x and y are set at 5, and the observations falling above the thresholds (indicated by vibrant blue–green points) are considered right censored. For example, any observation above 5 for x or y is recorded as 5.

Figure 2.1: Scatter plots of BVN data in x and y variables, depicting different types of censored observations. Vibrant blue–green points denote censored data, including left censored, right censored, and interval censored observations.
Black points represent uncensored data points. The red vertical line indicates the threshold in the x variable, while the green horizontal line denotes the threshold in the y variable.

In addition to the one–sided censoring described above, measuring instruments may lack the precision needed for accurate measurements. In such instances, the instrument can only provide a range within which the true value falls rather than an exact measurement. We refer to this situation as an Interval Censored (IC) observation, where both the lower and upper bounds of the true value are known. Interval censoring can also occur due to specific reporting protocols. The simplest example of interval censoring arises from a limited number of recorded digits: an instrument may record the mass of an object as 0.7 kg, which means the true value is some unknown value between 0.65 kg and 0.75 kg. Interval censoring is described more formally below.

\[
x_i^{o} =
\begin{cases}
[l_x^{ic}, u_x^{ic}] & \text{if } l_x^{ic} \le x_i \le u_x^{ic} \\
x_i & \text{otherwise}
\end{cases}
\qquad
y_i^{o} =
\begin{cases}
[l_y^{ic}, u_y^{ic}] & \text{if } l_y^{ic} \le y_i \le u_y^{ic} \\
y_i & \text{otherwise}
\end{cases}
\]

where $i = 1, 2, \ldots, n$ indexes the observations, $l_x^{ic}$ and $u_x^{ic}$ are the lower and upper limits of the interval in x, and $l_y^{ic}$ and $u_y^{ic}$ are the lower and upper limits of the interval in y. Interval censored observations are illustrated in the third scatter plot of Figure 2.1. The intervals for x and y are set at [2.5, 4.5] and [0, 2], respectively. The observations that fall within these limits (indicated in vibrant blue–green) are considered interval censored, and only the lower and upper limits are recorded.

2.3 Introduction to Regression

Regression is a fundamental technique used in statistical modeling with two primary objectives. First, regression modeling can be used to approximately quantify a relationship between dependent and independent variables. Second, regression models can be used to predict or estimate future trends and outcomes in machine learning [Maulud and Abdulazeez, 2020].
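Since the regression models discussed in this section are later fitted to censored data, it may help to see the recording rules of Section 2.2 applied to a few values. The Python sketch below is illustrative only: the function names and the sample values are hypothetical, while the thresholds (−2, 5, and the interval [2.5, 4.5] for x) are the ones used in Figure 2.1.

```python
def left_censor(value, threshold):
    """Record the threshold when the true value falls below it."""
    return value if value >= threshold else threshold


def right_censor(value, threshold):
    """Record the threshold when the true value falls above it."""
    return value if value <= threshold else threshold


def interval_censor(value, lower, upper):
    """Record only the interval limits when the true value lies inside them."""
    return (lower, upper) if lower <= value <= upper else value


# Hypothetical true values of x; thresholds follow Figure 2.1.
x = [-3.1, -2.0, 0.4, 3.0, 5.0, 6.2]

x_lc = [left_censor(v, -2.0) for v in x]
x_rc = [right_censor(v, 5.0) for v in x]
x_ic = [interval_censor(v, 2.5, 4.5) for v in x]

print("left censored:    ", x_lc)
print("right censored:   ", x_rc)
print("interval censored:", x_ic)
```

One design point worth noting: a recorded value equal to a threshold does not by itself reveal whether the observation was censored or simply measured at the threshold, so in practice a separate censoring indicator is usually stored alongside the recorded value. That indicator is omitted here to keep the sketch short.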
Regression analysis comprises different types of regression models, and the choice of the model depends on the nature of the data, research questions, and assumptions. Linear regression is often further characterized as either Simple Linear Regression (SLR) or Multiple Linear Regression (MLR). SLR involves a single independent variable and a single dependent variable, and explains the linear relationship between them. MLR involves multiple independent variables and a single dependent variable, and explains the linear relationship between the independent variables and the dependent variable [Maulud and Abdulazeez, 2020].

2.3.1 Simple Linear Regression (SLR)

The SLR model assumes a linear relationship between x and y, which can be expressed mathematically as

\[
y = \beta_0 + \beta_1 x + \epsilon, \qquad \epsilon \sim N(0, \sigma^2)
\]

where y represents the dependent variable, x represents the independent variable, $\beta_0$ represents the intercept term, $\beta_1$ represents the slope term, and $\epsilon$ represents the error term. The intercept provides the value of y when x is 0, and the slope indicates the change in y for a unit change in x. The predicted value of y, denoted $\hat{y}$, is obtained for a given x as $\hat{\beta}_0 + \hat{\beta}_1 x$, where $\hat{\beta}_0$ and $\hat{\beta}_1$ are the parameter estimates of the model. The error term $\epsilon$ captures the deviation of the observed y from the regression line; its sample counterpart, the residual, is the difference between the observed value of y and $\hat{y}$ [Maulud and Abdulazeez, 2020].

Moreover, linear regression modeling has four assumptions, and violating these assumptions can lead to significant issues in the analysis. Kim [2019] explained them as follows.

1. Linearity: There should be an approximate linear relationship between the independent and dependent variables. It can be visually inspected using a scatter plot between the two variables, and if it follows a linear pattern, the linearity assumption is satisfied.
2.
Independence: Observations should be independent of each other, meaning that the residuals from one observation should not be correlated with residuals from another observation. We can use a “residuals vs. fitted values” plot to examine whether this assumption is met. Here, residuals are defined as the differences between y and $\hat{y}$, while fitted values represent the $\hat{y}$ for each observation. If the scatter plot has a randomly scattered pattern around zero, this suggests that the assumption is satisfied. Cohen et al. [2003] also noted that this assumption can be satisfied if the sampling method is truly random.
3. Homoscedasticity: The variance of the residuals should be constant over fitted values. This assumption is met if the “residuals vs. fitted values” plot shows a constant spread around zero, without any pattern.
4. Normality of errors: The residuals are assumed to be normally distributed. This can be examined using a Normal Q–Q plot of residuals. The assumption is considered to be met if the residuals lie close to the diagonal line in the Normal Q–Q plot.

2.3.2 Least Squares (LS) Method

The LS method estimates parameters of a model by minimizing the sum of the squared differences between y and $\hat{y}$. In SLR, the coefficients of the regression model $\beta_0$ and $\beta_1$ can be obtained by minimizing the RSS, which is given by

\[
\mathrm{RSS} = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} \left[ y_i - (\beta_0 + \beta_1 x_i) \right]^2
\]

where x represents the independent variable and y represents the dependent variable for $i = 1, 2, \ldots, n$ observations [Maulud and Abdulazeez, 2020]. The parameter estimates of $\beta_0$ and $\beta_1$ can be determined by taking the partial derivatives of the RSS with respect to $\beta_0$ and $\beta_1$ and setting them to zero, as given by

\[
\frac{\partial\, \mathrm{RSS}}{\partial \beta_0} = \frac{\partial}{\partial \beta_0} \sum_{i=1}^{n} \left[ y_i - (\beta_0 + \beta_1 x_i) \right]^2 = 0
\]

and

\[
\frac{\partial\, \mathrm{RSS}}{\partial \beta_1} = \frac{\partial}{\partial \beta_1} \sum_{i=1}^{n} \left[ y_i - (\beta_0 + \beta_1 x_i) \right]^2 = 0.
\]

Solving these equations gives the following formulas for the LS slope coefficient and the LS intercept coefficient, respectively.

\[
\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{S_{xy}}{S_{xx}}
\]

\[
\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}
\]

Mamat et al. [2019] expressed the formulas for $S_{xx}$ and $S_{xy}$ as

\[
S_{xx} = \sum_{i=1}^{n} (x_i - \bar{x})^2 = \sum_{i=1}^{n} x_i^2 - \frac{\left( \sum_{i=1}^{n} x_i \right)^2}{n}
\]

\[
S_{xy} = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} x_i y_i - \frac{\left( \sum_{i=1}^{n} x_i \right) \left( \sum_{i=1}^{n} y_i \right)}{n}.
\]

The best fitting line for the SLR model based on the RSS objective can be represented as $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$.

Austin and Hoch [2004] studied the influence of a censored independent variable on the estimation of regression coefficients. They used the LS method to obtain the regression parameters and compared the performance of the method with the partial LS method and full–likelihood models. Monte Carlo simulations were conducted to assess the bias associated with each regression method. The findings revealed that the naive LS method, which ignored the censored observations during model fitting, introduced a significant bias in estimating the regression coefficient related to the censored variable. They highlighted the importance of considering censoring when estimating regression coefficients to obtain more accurate and unbiased results.

The study of Haitovsky [1968] explored two different approaches to address the problem of missing data in regression analysis. The first approach involved applying the LS method to complete data by discarding all incomplete data points. The second approach incorporated the covariance between pairs of variables in which values are available for both variables. Simulations were conducted on eight different regression datasets, and various deletion patterns were applied to them. The estimates obtained from the two approaches were compared with the true regression parameters. The study found that the LS method on complete data performed better in the majority of cases.
However, significant bias in the estimates was observed when the proportion of missing data was high and the missing data pattern was highly non–random. In addition, the LS method has been extended by many authors, such as Buckley and James [1979], Miller [1976], and Koul et al. [1981], to address the challenge of censoring in data.

2.3.3 Maximum Likelihood (ML) Method

ML estimation for linear regression selects the model parameters that maximize the likelihood of observing the given data [Gustavsson, 2012]. Soch et al. [2023] assumed normally distributed independent data and defined the likelihood function of a linear regression model as

\[ L(\beta_0, \beta_1, \sigma^2) = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left[ -\frac{1}{2\sigma^2} (y_i - \beta_0 - \beta_1 x_i)^2 \right]. \tag{2.1} \]

The log–likelihood function is written as

\[ l(\beta_0, \beta_1, \sigma^2) = \log L(\beta_0, \beta_1, \sigma^2) = -\frac{n}{2} \log(2\pi) - \frac{n}{2} \log(\sigma^2) - \frac{1}{2\sigma^2} \mathrm{RSS}. \tag{2.2} \]

The derivatives of the log–likelihood function in equation 2.2 with respect to $\beta_0$ and $\beta_1$ are set to zero:

\[ \frac{\partial l(\beta_0, \beta_1, \sigma^2)}{\partial \beta_0} = \frac{1}{\sigma^2} \sum_{i=1}^{n} (y_i - \beta_0 - \beta_1 x_i) = 0 \]

\[ \frac{\partial l(\hat{\beta}_0, \beta_1, \sigma^2)}{\partial \beta_1} = \frac{1}{\sigma^2} \sum_{i=1}^{n} (x_i y_i - \hat{\beta}_0 x_i - \beta_1 x_i^2) = 0 \]

Solving these equations gives the following parameter estimates of the SLR model [Soch et al., 2023].

\[ \hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{S_{xy}}{S_{xx}} \tag{2.3} \]

\[ \hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} \tag{2.4} \]

Gustavsson [2012] performed regression analysis on log–normal heteroscedastic data using an ML approach and compared its performance with the LS method and the Weighted LS (WLS) method. All three methods gave unbiased estimates, but the ML and WLS methods had smaller standard errors than the LS method. Additionally, the ML method provided the highest power for tests of regression coefficients. The CIs for the estimated response were calculated, and it was observed that the ML and WLS methods provided correct CIs.

Hussain et al. [2022] conducted a study on estimating the parameters of a truncated linear regression model using the ML method.
They aimed to estimate the model parameters and the error variance accurately using this method. The dependent variable was assumed to follow a truncated t–distribution, with truncation occurring on both sides. Based on the MSE, they found that the ML method provides more accurate results than the LS method.

2.4 Techniques for Handling Missing Data

Various techniques exist to handle missing data, which can be mainly classified into two categories: traditional approaches and advanced (modern) approaches, as shown in Figure 2.1. Traditional approaches, such as listwise deletion (LD), pairwise deletion, mean imputation, and regression imputation, are simple and commonly used methods whose validity depends on the missing data mechanism.

LD handles missing data by eliminating observations with missing values and considering only the complete observations. This is also known as complete case analysis. This approach is well known for providing unbiased parameter estimates only if the MCAR assumption is satisfied [Kang, 2013]. However, it is not a reasonable strategy for small sample sizes or if the data are not MCAR. The pairwise deletion method evaluates each feature in a dataset independently, considering all the observed values within each feature while disregarding any missing values [Strike et al., 2001]. Because fewer cases are lost, pairwise deletion may result in less information loss than LD [Shylaja and Kumar, 2018].

Mean imputation and regression imputation are basic methods of missing data replacement. These approaches replace the unobserved data using the observed data. The mean imputation method replaces a missing value with the average of the observed values of the variable.
Although the mean imputation method preserves the sample size of the original dataset, it can introduce bias if the data are not MCAR and if there is any relationship between the variables [Jadhav et al., 2019]. The regression imputation method is a better approach than mean imputation: it fits a regression model to the observed data and predicts the missing values using the fitted model [Annas et al., 2021]. This technique provides unbiased estimates for MAR data with few missing observations. However, having few complete observations can lead to overfitting of the model, ultimately providing inaccurate predictions. These existing approaches are well known for providing unbiased estimates only if the MCAR assumption is satisfied under small sample sizes [Annas et al., 2021, Hardt et al., 2012, Jadhav et al., 2019, Kang, 2013].

Figure 2.2: Traditional and modern missing data handling methods [Shylaja and Kumar, 2018]

Multiple Imputation (MI), the Maximum Likelihood (ML) method, and the EM algorithm are advanced statistical techniques that effectively handle missing data. The MI method generates multiple complete datasets using techniques such as the bootstrap and Markov Chain Monte Carlo (MCMC), replacing the missing values with plausible values. Each complete dataset is then analyzed, and the results are pooled to obtain a combined result [Austin et al., 2021]. This approach accounts for the uncertainty surrounding the actual values of the imputed variables, providing a more comprehensive understanding of the potential range of outcomes. ML estimation estimates the parameters of a statistical model by maximizing the likelihood function, and it provides consistent and asymptotically efficient estimates, similar to the multiple imputation method.
Expressions for the log–likelihood function in the presence of complete data and missing data are explained by Allison [2012].

2.5 Expectation–Maximization (EM) Algorithm

Incomplete data make the likelihood function complex; missing data, censored data, and data originating from mixture distributions are some of the reasons for complications in the likelihood function [Ng et al., 2011]. Dempster et al. [1977] proposed the EM algorithm, which is regarded as another ML–based method that can be used when the likelihood function becomes complicated. It estimates the parameters by maximizing the expected complete data log–likelihood function, iterating between the E–step and the M–step of the algorithm. More information about the history of the EM algorithm is given in McLachlan et al. [2004].

The EM algorithm in the presence of incomplete data is defined as follows in Ng et al. [2011]. Let $\theta$ represent a vector of model parameters and $f_i(x_i, y_i)$ the joint PDF of $x$ and $y$ for observation $i$. If $x$ and $y$ are completely observed, the complete data log–likelihood function $l(\theta; x, y)$ is expressed as

\[ l(\theta; x, y) = \sum_{i=1}^{n} \log f_i(x_i, y_i \mid \theta). \]

Step 1: Initialization: Start the process by initializing the model parameters $\theta^{(0)}$. They can be set using random values, prior information, or parameters obtained from the observed data. The initial model parameters are crucial as they directly affect the convergence of the algorithm and the accuracy of the parameter estimates.

Step 2: Expectation step (E–step): Compute the expected value of the complete data log–likelihood function $l(\theta; x, y)$ with respect to the conditional distribution of the unobserved data given the observed data and the current parameter estimates $\theta^{(k)}$, where $k$ is the iteration number in the algorithm. It is expressed as

\[ Q(\theta \mid \theta^{(k)}) = E_{\theta^{(k)}} \left[ l(\theta \mid x, y, \theta^{(k)}) \right]. \]
Step 3: Maximization step (M–step): Maximize the expected complete data log–likelihood computed in the E–step to update the parameters, expressed mathematically as

\[ \theta^{(k+1)} = \arg\max_{\theta} \, Q(\theta \mid \theta^{(k)}). \]

Step 4: Iteration: The E–step and M–step are repeated until convergence. Convergence of the sequence can be assessed using a stopping rule such as

\[ \left\lVert \theta^{(k+1)} - \theta^{(k)} \right\rVert < \tau \]

where $\tau$ is a small positive threshold value.

2.5.1 Applications of the EM Algorithm in the Presence of Incomplete Data

Applications of the EM algorithm have been established under different distributions, such as Exponential distributions, Poisson distributions, Gaussian distributions, and Gaussian Mixture Models (GMM), and in survival analysis models, such as Cox proportional hazards models. In addition, it has been used to handle different types of incomplete data, including missing data, truncated data, censored data, and data that are both truncated and censored.

Although the term EM algorithm was proposed by Dempster et al. [1977], certain theoretical foundations of the algorithm were introduced by Orchard and Woodbury [1972] and by Reynolds and Sundberg [1976]. Dempster et al. [1977] provided a general definition of the EM algorithm when the complete data belong to the regular and curved exponential families. They discussed missing data in the multivariate normal, normal linear, and multinomial models.

Atkinson [1992] proposed a hybrid EM and standard EM algorithm for a finite mixture of two univariate normal distributions when the data were right censored to varying degrees. Although it is commonly believed that the EM algorithm is computationally intensive and inaccurate, Atkinson [1992] found it to provide accurate and efficient results. He also observed failures in convergence under extreme right censoring, but the proposed hybrid EM algorithm demonstrated superior performance at any level of censoring.
Lee and Scott [2012] fitted multivariate Gaussian Mixture Models (GMM) while handling truncated and censored data together using the EM algorithm. They implemented both the standard EM algorithm and a version of the EM algorithm adapted for truncation and censoring. This was illustrated in both the simulation studies and the analysis of flow cytometry data. A flow cytometer is a tool used to diagnose diseases such as malignant lymphoma and acute leukemia. It has limitations in identifying the range of signal strength, which causes truncation and censoring in the data. The properties of truncated multivariate normal distributions were incorporated into the computation, and the simulation results showed that the proposed algorithm corrected the bias introduced by truncation and censoring, outperforming the standard EM algorithm.

Ng et al. [2002] also conducted a study on obtaining parameter estimates for progressively censored log–normal and Weibull lifetime distributions using the EM algorithm. They observed that the EM algorithm converged more slowly compared to the Newton–Raphson method, as highlighted by Little and Rubin [2002]. However, Ng et al. [2002] emphasized the advantage of the EM algorithm in incorporating information from censored data in a seamless manner, a feature not observed in the Newton–Raphson method. Additionally, they highlighted its simple generalization to other types of censoring. Furthermore, Ferreira and Silva [2017] investigated a similar study on the Weibull distribution using the EM method, but with right censored data.

Park and Lee [2012] developed a parameter estimation method using the EM and Monte Carlo Expectation–Maximization (MCEM) algorithms for normal, Laplace, and Rayleigh distributions in the presence of right censored data.

The performance of the EM method for mixtures of elliptical distributions, assuming MCAR or MAR assumptions, was investigated by Mouret et al. [2023].
The simulation study demonstrated that the proposed method is robust to noise and non–Gaussian data. In addition, the method was applied to real–world datasets, and the authors concluded that the algorithm is highly competitive compared to traditional imputation methods.

Hybrid censoring is a combination of both Type I and Type II censoring. Dube et al. [2011] conducted a study on hybrid censored log–normal distributions and obtained the parameter estimates of the model using the EM method. The asymptotic CIs were obtained using the FIM, and a real–world dataset was used to illustrate the applicability of the method. Moreover, they extended the study to address the problem of determining the optimum censoring scheme, which involves selecting from all possible hybrid censoring schemes.

The parameter estimates of a quasi–Lindley distribution were obtained by Kayid and Al-Maflehi [2022] for both right censored and uncensored data using the EM algorithm. They performed a simulation study, which resulted in better estimates from the EM method compared to the MLEs for both data scenarios. Additionally, a dataset related to waiting times of customers in banks was used to illustrate the method.

Josse et al. [2018] estimated the mean of a BVN dataset in the presence of missing data in one of the two variables. This was implemented for a synthetic dataset under the MAR assumption. Furthermore, they designed the method to handle missing data in logistic regression models using a Monte Carlo version of the EM algorithm.

2.6 Univariate Normal Distribution

Burkardt [2023] mathematically defined the general univariate normal distribution using the standard normal distribution. A random variable $x$ with mean $\mu_x$ and standard deviation $\sigma_x$ can be transformed to a standard normal variable as $z = \frac{x - \mu_x}{\sigma_x}$, where $z \sim N(0, 1)$. Let the random vector $x = (x_1, x_2, \ldots, x_n)^T$ follow a normal distribution with mean $\mu_x$ and standard deviation $\sigma_x$.
Then the PDF is

\[ f_x(x; \mu_x, \sigma_x) = \frac{1}{\sqrt{2\pi\sigma_x^2}} \exp\left( -\frac{(x - \mu_x)^2}{2\sigma_x^2} \right) \]

where $x \in \mathbb{R}$, $\mu_x \in \mathbb{R}$ is a location parameter, and $\sigma_x > 0$ is a scale parameter. The first and second moments of the univariate normal distribution are written as

\[ E(x) = \int_{-\infty}^{\infty} x \cdot f_x(x; \mu_x, \sigma_x) \, dx \]

\[ E(x^2) = \int_{-\infty}^{\infty} x^2 \cdot f_x(x; \mu_x, \sigma_x) \, dx \]

and the variance, which measures the mean squared difference between a randomly selected observation and the mean of the distribution, is written as

\[ Var(x) = \int_{-\infty}^{\infty} (x - \mu_x)^2 \cdot f_x(x; \mu_x, \sigma_x) \, dx. \]

The CDF, or the probability that a random value is less than or equal to $x_u$, is expressed as

\[ P(x \le x_u) = \Phi\left( \frac{x_u - \mu_x}{\sigma_x} \right) = \frac{1}{\sqrt{2\pi\sigma_x^2}} \int_{-\infty}^{x_u} \exp\left( -\frac{(x - \mu_x)^2}{2\sigma_x^2} \right) dx. \]

The probability that a random value is greater than $x_l$ is expressed as

\[ P(x > x_l) = 1 - \Phi\left( \frac{x_l - \mu_x}{\sigma_x} \right) = \frac{1}{\sqrt{2\pi\sigma_x^2}} \int_{x_l}^{\infty} \exp\left( -\frac{(x - \mu_x)^2}{2\sigma_x^2} \right) dx \]

and the probability that a random value falls between $x_l$ and $x_u$ is expressed as

\[ P(x_l < x < x_u) = \Phi\left( \frac{x_u - \mu_x}{\sigma_x} \right) - \Phi\left( \frac{x_l - \mu_x}{\sigma_x} \right) = \frac{1}{\sqrt{2\pi\sigma_x^2}} \int_{x_l}^{x_u} \exp\left( -\frac{(x - \mu_x)^2}{2\sigma_x^2} \right) dx \]

where $\Phi(\cdot)$ is the CDF of the standard normal distribution. The PDF of the standard normal distribution is denoted as $\phi(\cdot)$.

2.6.1 Right Censored Univariate Normal Data

Ng et al. [2002] implemented the EM algorithm for right censored data in one variable and obtained the MLEs by incorporating the first and second moments of the left truncated normal distribution, as derived by Balakrishnan and Cohen [1991]. They showed that the conditional distribution of right censored data follows a distribution truncated from the left. Let $x$ be a random variable that follows a normal distribution with mean $\mu_x$ and standard deviation $\sigma_x$, where observations are either right censored or left truncated at the value $x = c_x^{rc}$.
The first moment of a random variable $x$ right censored at $c_x^{rc}$, where $\alpha_x = \frac{c_x^{rc} - \mu_x}{\sigma_x}$, is expressed by Balakrishnan and Cohen [1991] according to the right censored normal distribution as

\[ E(x \mid x > c_x^{rc}, \mu_x, \sigma_x) = \mu_x + \sigma_x \frac{\phi(\alpha_x)}{1 - \Phi(\alpha_x)} \]

and the conditional variance is obtained using

\[ Var(x \mid x > c_x^{rc}, \mu_x, \sigma_x) = \sigma_x^2 \left[ 1 + \alpha_x \frac{\phi(\alpha_x)}{1 - \Phi(\alpha_x)} - \left( \frac{\phi(\alpha_x)}{1 - \Phi(\alpha_x)} \right)^2 \right]. \]

2.6.2 Left Censored Univariate Normal Data

Rueda and Garcia [2018] performed a study on estimating process parameters with highly censored data. They assumed that the variable $x$ is normally distributed with mean $\mu_x$ and variance $\sigma_x^2$, with some observations being left censored at the value $c_x^{lc}$. Also, Steiner and Mackay [2000] determined that the formulas for left censoring are similar to those for right censoring. The probability that $x$ is less than the censoring point $c_x^{lc}$ is expressed as

\[ P(x < c_x^{lc}) = P\left( \frac{x - \mu_x}{\sigma_x} < \frac{c_x^{lc} - \mu_x}{\sigma_x} \right) = P\left( Z \le \frac{c_x^{lc} - \mu_x}{\sigma_x} \right) = \Phi\left( \frac{c_x^{lc} - \mu_x}{\sigma_x} \right). \]

The conditional expected value for left censored $x$, where $\gamma_x = \frac{c_x^{lc} - \mu_x}{\sigma_x}$, is expressed as

\[ E(x \mid x < c_x^{lc}) = \mu_x - \sigma_x \frac{\phi(\gamma_x)}{\Phi(\gamma_x)}. \]

The conditional variance of left censored $x$ is similar in form to the conditional variance of right censored $x$ and can be expressed as

\[ Var(x \mid x < c_x^{lc}, \mu_x, \sigma_x) = \sigma_x^2 \left[ 1 - \gamma_x \frac{\phi(\gamma_x)}{\Phi(\gamma_x)} - \left( \frac{\phi(\gamma_x)}{\Phi(\gamma_x)} \right)^2 \right]. \]

The conditional expectation of $x^2$ can be derived from the conditional expectation and conditional variance of $x$.

2.6.3 Interval Censored Univariate Normal Data

Canavire-Bacarreza et al. [2023] modeled the probability that an observation falls within a range of two values. Let $x$ be a random vector that is normally distributed with mean $\mu_x$ and standard deviation $\sigma_x$, with some observations recorded only as lying within the interval $[l_x^{ic}, u_x^{ic}]$. The probability that an observation in $x$ falls within the range $[l_x^{ic}, u_x^{ic}]$ is expressed as

\[ P(l_x^{ic} \le x \le u_x^{ic}) = P\left( \frac{l_x^{ic} - \mu_x}{\sigma_x} \le \frac{x - \mu_x}{\sigma_x} \le \frac{u_x^{ic} - \mu_x}{\sigma_x} \right) = \Phi\left( \frac{u_x^{ic} - \mu_x}{\sigma_x} \right) - \Phi\left( \frac{l_x^{ic} - \mu_x}{\sigma_x} \right) \]

where $\Phi(\cdot)$ is the CDF.
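The conditional moments for right and left censored observations in Sections 2.6.1 and 2.6.2 can be sketched in code as follows. This is a minimal illustration using only the Python standard library, not the implementation used in this study; the function names `right_censored_moments`, `left_censored_moments`, and `second_moment` are hypothetical names chosen for this example.

```python
from statistics import NormalDist

# Standard normal distribution: .pdf plays the role of phi, .cdf of Phi
_STD = NormalDist()


def right_censored_moments(c, mu, sigma):
    """Mean and variance of x given x > c, for x ~ N(mu, sigma^2)."""
    alpha = (c - mu) / sigma
    ratio = _STD.pdf(alpha) / (1.0 - _STD.cdf(alpha))
    mean = mu + sigma * ratio
    var = sigma ** 2 * (1.0 + alpha * ratio - ratio ** 2)
    return mean, var


def left_censored_moments(c, mu, sigma):
    """Mean and variance of x given x < c, for x ~ N(mu, sigma^2)."""
    gamma = (c - mu) / sigma
    ratio = _STD.pdf(gamma) / _STD.cdf(gamma)
    mean = mu - sigma * ratio
    var = sigma ** 2 * (1.0 - gamma * ratio - ratio ** 2)
    return mean, var


def second_moment(mean, var):
    """Conditional E(x^2), recovered from the conditional mean and variance."""
    return var + mean ** 2
```

For a standard normal variable censored at its mean, the two functions return means of equal magnitude and opposite sign with the same variance, which is a quick symmetry check on the formulas.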
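The conditional mean of interval censored $x$, where $\eta_x = \frac{l_x^{ic} - \mu_x}{\sigma_x}$ and $\delta_x = \frac{u_x^{ic} - \mu_x}{\sigma_x}$, is expressed as

\[ E(x \mid l_x^{ic} < x < u_x^{ic}) = \mu_x - \sigma_x \frac{\phi(\delta_x) - \phi(\eta_x)}{\Phi(\delta_x) - \Phi(\eta_x)}. \]

Similarly, the conditional variance of interval censored $x$ is expressed as

\[ Var(x \mid l_x^{ic} < x < u_x^{ic}) = \sigma_x^2 \left[ 1 - \frac{\delta_x \phi(\delta_x) - \eta_x \phi(\eta_x)}{\Phi(\delta_x) - \Phi(\eta_x)} - \left( \frac{\phi(\delta_x) - \phi(\eta_x)}{\Phi(\delta_x) - \Phi(\eta_x)} \right)^2 \right]. \]

These equations can be used to obtain the second moment of the interval censored observations.

2.7 Bivariate Normal (BVN) Distribution

The BVN distribution describes the joint distribution of two correlated normal random variables. Let $(x, y)$ be a pair of random variables following a BVN distribution with means $\mu_x, \mu_y \in \mathbb{R}$, standard deviations $\sigma_x, \sigma_y > 0$, and correlation $\rho \in [-1, 1]$ [Roussas, 2003]. The joint PDF is given by

\[ f_{X,Y}(x, y) = \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1 - \rho^2)} \left[ \frac{(x - \mu_x)^2}{\sigma_x^2} - 2\rho \frac{(x - \mu_x)(y - \mu_y)}{\sigma_x\sigma_y} + \frac{(y - \mu_y)^2}{\sigma_y^2} \right] \right\} \]

where $x, y \in \mathbb{R}$. Roussas [2003] presented the following properties of the BVN distribution.

\[ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f_{X,Y}(x, y) \, dx \, dy = 1 \]

\[ x \sim N(\mu_x, \sigma_x^2) \quad \text{and} \quad y \sim N(\mu_y, \sigma_y^2) \]

If $x$ and $y$ follow a BVN distribution with parameters $\mu_x, \mu_y, \sigma_x, \sigma_y, \rho$, the conditional distribution of $y \mid x = x$ is also normal, and the conditional mean and conditional variance of $y$ given $x$ can be found in Chacko and Mathew [2020] and Roussas [2003] as follows.

\[ E(y \mid x = x) = \mu_y + \rho \frac{\sigma_y}{\sigma_x} (x - \mu_x) \]

\[ Var(y \mid x = x) = \sigma_y^2 (1 - \rho^2). \]

Thus, the distribution of $y$ conditioned on $x$ is

\[ y \mid (x = x) \sim N\left( \mu_y + \rho \frac{\sigma_y}{\sigma_x} (x - \mu_x), \; \sigma_y^2 (1 - \rho^2) \right). \tag{2.5} \]

The second moment of the distribution of $y$ conditioned on $x$ can be determined using the above conditional mean and variance. The conditional distribution of $x \mid y = y$ can be obtained using

\[ E(x \mid y = y) = \mu_x + \rho \frac{\sigma_x}{\sigma_y} (y - \mu_y) \]

\[ Var(x \mid y = y) = \sigma_x^2 (1 - \rho^2). \]

Thus, the distribution of $x$ conditioned on $y$ is

\[ x \mid (y = y) \sim N\left( \mu_x + \rho \frac{\sigma_x}{\sigma_y} (y - \mu_y), \; \sigma_x^2 (1 - \rho^2) \right). \tag{2.6} \]

The second moment of the distribution of $x$ conditioned on $y$ can be determined using the conditional mean and variance. Roussas [2003] presented the joint PDF using the marginal density of $x$ and the conditional density of $y \mid x$ as

\[ f_{X,Y}(x, y) = \frac{1}{\sqrt{2\pi\sigma_x^2}} e^{-\frac{(x - \mu_x)^2}{2\sigma_x^2}} \cdot \frac{1}{\sqrt{2\pi\sigma_y^2 (1 - \rho^2)}} e^{-\frac{(y - b_x)^2}{2\sigma_y^2 (1 - \rho^2)}} = f_X(x) \cdot f_{Y|X}(y \mid x) \]

where $b_x = \mu_y + \rho \frac{\sigma_y}{\sigma_x} (x - \mu_x)$,

\[ f_X(x) = \frac{1}{\sqrt{2\pi\sigma_x^2}} e^{-\frac{(x - \mu_x)^2}{2\sigma_x^2}}, \qquad f_{Y|X}(y \mid x) = \frac{1}{\sqrt{2\pi\sigma_y^2 (1 - \rho^2)}} e^{-\frac{(y - b_x)^2}{2\sigma_y^2 (1 - \rho^2)}}. \]

This factorization is used to derive the formula for the covariance:

\[
\begin{aligned}
E(XY) &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x y \, f_{X,Y}(x, y) \, dx \, dy \\
&= \int_{-\infty}^{\infty} x f_X(x) \left[ \int_{-\infty}^{\infty} y \, f_{Y|X}(y \mid x) \, dy \right] dx \\
&= \int_{-\infty}^{\infty} x f_X(x) \left( \mu_y + \rho \frac{\sigma_y}{\sigma_x} (x - \mu_x) \right) dx \\
&= \mu_y E(X) + \rho \frac{\sigma_y}{\sigma_x} E\left[ X (X - \mu_x) \right] \\
&= \mu_x \mu_y + \rho \sigma_x \sigma_y.
\end{aligned}
\]

Since $E(x) = \mu_x$, $E(y) = \mu_y$, $Var(x) = \sigma_x^2$, and $Var(y) = \sigma_y^2$, Roussas [2003] obtained the covariance between $x$ and $y$ by substitution as

\[ Cov(x, y) = E(xy) - E(x)E(y) = \mu_x \mu_y + \rho \sigma_x \sigma_y - \mu_x \mu_y = \rho \sigma_x \sigma_y. \]

In addition, Roussas [2003] found the MLEs of the model parameters as follows. Set $\theta = (\mu_x, \mu_y, \sigma_x, \sigma_y, \rho)$ for convenience. The likelihood function for the BVN distribution of $x = (x_1, \ldots, x_n)$, $y = (y_1, \ldots, y_n)$ with sample size $n$ is written as

\[ L(\theta \mid x, y) = \left( \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1 - \rho^2}} \right)^n \exp\left( -\frac{1}{2} \sum_{i=1}^{n} q_i \right) \]

where

\[ q_i = \frac{1}{1 - \rho^2} \left[ \left( \frac{x_i - \mu_x}{\sigma_x} \right)^2 - \frac{2\rho (x_i - \mu_x)(y_i - \mu_y)}{\sigma_x \sigma_y} + \left( \frac{y_i - \mu_y}{\sigma_y} \right)^2 \right]. \]

The log–likelihood function is written as

\[ l(\theta \mid x, y) = -n \log(2\pi) - \frac{n}{2} \log \sigma_x^2 - \frac{n}{2} \log \sigma_y^2 - \frac{n}{2} \log(1 - \rho^2) - \frac{1}{2} \sum_{i=1}^{n} q_i. \tag{2.7} \]

The partial derivatives of $l(\theta \mid x, y)$ with respect to the parameters $\mu_x, \mu_y, \sigma_x^2, \sigma_y^2, \rho$ are written as

\[ \frac{\partial l(\theta \mid x, y)}{\partial \mu_x} = \frac{n}{\sigma_x^2 (1 - \rho^2)} (\bar{x} - \mu_x) - \frac{n\rho}{\sigma_x \sigma_y (1 - \rho^2)} (\bar{y} - \mu_y), \]

\[ \frac{\partial l(\theta \mid x, y)}{\partial \mu_y} = \frac{n}{\sigma_y^2 (1 - \rho^2)} (\bar{y} - \mu_y) - \frac{n\rho}{\sigma_x \sigma_y (1 - \rho^2)} (\bar{x} - \mu_x), \]

\[ \frac{\partial l(\theta \mid x, y)}{\partial \sigma_x^2} = -\frac{n}{2\sigma_x^2} + \frac{\sum_{i=1}^{n} (x_i - \mu_x)^2}{2\sigma_x^4 (1 - \rho^2)} - \frac{\rho \sum_{i=1}^{n} (x_i - \mu_x)(y_i - \mu_y)}{2\sigma_x^3 \sigma_y (1 - \rho^2)}, \]

\[ \frac{\partial l(\theta \mid x, y)}{\partial \sigma_y^2} = -\frac{n}{2\sigma_y^2} + \frac{\sum_{i=1}^{n} (y_i - \mu_y)^2}{2\sigma_y^4 (1 - \rho^2)} - \frac{\rho \sum_{i=1}^{n} (x_i - \mu_x)(y_i - \mu_y)}{2\sigma_x \sigma_y^3 (1 - \rho^2)}, \]

\[ \frac{\partial l(\theta \mid x, y)}{\partial \rho} = \frac{n\rho}{1 - \rho^2} - \frac{\rho \sum_{i=1}^{n} (x_i - \mu_x)^2}{\sigma_x^2 (1 - \rho^2)^2} - \frac{\rho \sum_{i=1}^{n} (y_i - \mu_y)^2}{\sigma_y^2 (1 - \rho^2)^2} + \frac{(1 + \rho^2) \sum_{i=1}^{n} (x_i - \mu_x)(y_i - \mu_y)}{\sigma_x \sigma_y (1 - \rho^2)^2}. \]

The ML estimates for $\mu_x$ and $\mu_y$ are obtained by setting $\frac{\partial l(\theta \mid x, y)}{\partial \mu_x} = \frac{\partial l(\theta \mid x, y)}{\partial \mu_y} = 0$ as

\[ \hat{\mu}_x = \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} \quad \text{and} \quad \hat{\mu}_y = \bar{y} = \frac{\sum_{i=1}^{n} y_i}{n}. \tag{2.8} \]

The ML estimates for $\sigma_x$, $\sigma_y$, and $\rho$ are obtained by setting $\frac{\partial l(\theta \mid x, y)}{\partial \sigma_x^2} = \frac{\partial l(\theta \mid x, y)}{\partial \sigma_y^2} = \frac{\partial l(\theta \mid x, y)}{\partial \rho} = 0$ and substituting the expressions obtained for $\mu_x$ and $\mu_y$.

\[ \hat{\sigma}_x = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2} = \sqrt{\frac{\sum_{i=1}^{n} x_i^2}{n} - \left( \frac{\sum_{i=1}^{n} x_i}{n} \right)^2} \tag{2.9} \]

\[ \hat{\sigma}_y = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \bar{y})^2} = \sqrt{\frac{\sum_{i=1}^{n} y_i^2}{n} - \left( \frac{\sum_{i=1}^{n} y_i}{n} \right)^2} \tag{2.10} \]

\[ \hat{\rho} = \frac{\hat{\sigma}_{xy}}{\hat{\sigma}_x \hat{\sigma}_y} \quad \text{where} \quad \hat{\sigma}_{xy} = \frac{\sum_{i=1}^{n} x_i y_i}{n} - \frac{\sum_{i=1}^{n} x_i}{n} \cdot \frac{\sum_{i=1}^{n} y_i}{n} \tag{2.11} \]

Although closed–form solutions for the ML estimates exist, they cannot be computed directly because they depend on $x$ and $y$, which contain missing or censored observations. To overcome this problem, an EM–based procedure is developed.

Chapter 3

Methodology

This chapter details the design of the EM algorithm to fit an SLR model for BVN data under various incomplete data scenarios. The first section offers a general overview for any type of incomplete data, while the second section focuses specifically on the missing data scenario.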
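As a reference point for the incomplete data procedure developed in this chapter, the complete data ML estimates in equations 2.8–2.11 and the SLR coefficients they imply can be sketched as follows. This is a minimal illustration for fully observed data, not the code used in this study; the function names `bvn_mle` and `slr_from_bvn` are hypothetical, and the slope is computed as $\rho \sigma_y / \sigma_x$, which equals $\sigma_{xy} / \sigma_x^2$.

```python
import math


def bvn_mle(x, y):
    """Complete data ML estimates (equations 2.8-2.11) for paired lists x, y."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    # ML estimators divide by n (not n - 1)
    sigma_x = math.sqrt(sum(v * v for v in x) / n - mu_x ** 2)
    sigma_y = math.sqrt(sum(v * v for v in y) / n - mu_y ** 2)
    sigma_xy = sum(a * b for a, b in zip(x, y)) / n - mu_x * mu_y
    rho = sigma_xy / (sigma_x * sigma_y)
    return mu_x, mu_y, sigma_x, sigma_y, rho


def slr_from_bvn(mu_x, mu_y, sigma_x, sigma_y, rho):
    """Intercept and slope of the regression of y on x implied by BVN parameters."""
    slope = rho * sigma_y / sigma_x
    intercept = mu_y - slope * mu_x
    return intercept, slope
```

With incomplete data, the sums inside `bvn_mle` are exactly the quantities that can no longer be computed directly, which is what motivates replacing them by conditional expectations.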
3.1 EM Algorithm for the BVN Distribution

The log–likelihood function in equation 2.7, based on the complete data $(x, y)$, can be rewritten as

\[
\begin{aligned}
l(\theta; x, y) = {} & -n \log(2\pi) - \frac{n}{2} \log \sigma_x^2 - \frac{n}{2} \log \sigma_y^2 - \frac{n}{2} \log(1 - \rho^2) \\
& - \frac{1}{2(1 - \rho^2)} \left[ \frac{\sum_{i=1}^{n} x_i^2 - 2\mu_x \sum_{i=1}^{n} x_i + n\mu_x^2}{\sigma_x^2} \right. \\
& \quad - \frac{2\rho \left( \sum_{i=1}^{n} x_i y_i - \mu_y \sum_{i=1}^{n} x_i - \mu_x \sum_{i=1}^{n} y_i + n\mu_x \mu_y \right)}{\sigma_x \sigma_y} \\
& \quad \left. + \frac{\sum_{i=1}^{n} y_i^2 - 2\mu_y \sum_{i=1}^{n} y_i + n\mu_y^2}{\sigma_y^2} \right].
\end{aligned}
\tag{3.1}
\]

The sufficient statistics of the log–likelihood function in equation 3.1 are $\sum_{i=1}^{n} x_i$, $\sum_{i=1}^{n} x_i^2$, $\sum_{i=1}^{n} y_i$, $\sum_{i=1}^{n} y_i^2$, and $\sum_{i=1}^{n} x_i y_i$. However, it is challenging to compute the parameter estimates of the model due to the presence of incomplete data in $x$ and $y$. In particular, some $x$ or $y$ values are observed while others are missing or censored. Hence, the EM algorithm is used to determine the optimal parameters by maximizing $E(l(\theta; x, y))$. It achieves this by replacing the contributions of the incomplete values to the sufficient statistics with their respective expected values. The E–step of the algorithm computes conditional expected values for the incomplete observations based on the specific type of incompleteness present in the data. Specifically, $E(x)$, $E(x^2)$, $E(y)$, $E(y^2)$, and $E(xy)$ are computed considering the distributional assumptions, the nature of the incompleteness, and the current parameter estimates.

Suppose we have a BVN dataset with $(x_i, y_i)$ for $i = 1, 2, \ldots, n$, where the data are categorized as follows: $(x_1^o, y_1^o)$ to $(x_m^o, y_m^o)$ are fully observed; $(x_{m+1}^o, y_{m+1}^{inc})$ to $(x_p^o, y_p^{inc})$ have observed $x$ and incomplete $y$; $(x_{p+1}^{inc}, y_{p+1}^o)$ to $(x_q^{inc}, y_q^o)$ have incomplete $x$ and observed $y$; and $(x_{q+1}^{inc}, y_{q+1}^{inc})$ to $(x_n^{inc}, y_n^{inc})$ have incomplete $x$ and incomplete $y$, where $0 < m < p < q < n$, “o” denotes observed data, and “inc” denotes incomplete data, which can be either missing or censored. The initial parameters $\theta^{(0)} = (\mu_x^{(0)}, \mu_y^{(0)}, \sigma_x^{(0)}, \sigma_y^{(0)}, \rho^{(0)})$ are computed in the first step by ignoring the incomplete cases in the dataset.
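| Index | $1$ | $\cdots$ | $m$ | $m+1$ | $\cdots$ | $p$ | $p+1$ | $\cdots$ | $q$ | $q+1$ | $\cdots$ | $n$ |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $x$ | $x_1^o$ | $\cdots$ | $x_m^o$ | $x_{m+1}^o$ | $\cdots$ | $x_p^o$ | $x_{p+1}^{inc}$ | $\cdots$ | $x_q^{inc}$ | $x_{q+1}^{inc}$ | $\cdots$ | $x_n^{inc}$ |
| $y$ | $y_1^o$ | $\cdots$ | $y_m^o$ | $y_{m+1}^{inc}$ | $\cdots$ | $y_p^{inc}$ | $y_{p+1}^o$ | $\cdots$ | $y_q^o$ | $y_{q+1}^{inc}$ | $\cdots$ | $y_n^{inc}$ |

Table 3.1: Data structure for variables x and y

Therefore, the E–step at the $k$th iteration computes

\[ s_1^{(k)} = E_{\theta^{(k)}} \left( \sum_{i=1}^{n} x_i \;\middle|\; x_i^o, y_i^o, x_i^{inc}, y_i^{inc} \right) \tag{3.2} \]

\[ s_2^{(k)} = E_{\theta^{(k)}} \left( \sum_{i=1}^{n} y_i \;\middle|\; x_i^o, y_i^o, x_i^{inc}, y_i^{inc} \right) \tag{3.3} \]

\[ s_1^{2\,(k)} = E_{\theta^{(k)}} \left( \sum_{i=1}^{n} x_i^2 \;\middle|\; x_i^o, y_i^o, x_i^{inc}, y_i^{inc} \right) \tag{3.4} \]

\[ s_2^{2\,(k)} = E_{\theta^{(k)}} \left( \sum_{i=1}^{n} y_i^2 \;\middle|\; x_i^o, y_i^o, x_i^{inc}, y_i^{inc} \right) \tag{3.5} \]

\[ s_{12}^{(k)} = E_{\theta^{(k)}} \left( \sum_{i=1}^{n} x_i y_i \;\middle|\; x_i^o, y_i^o, x_i^{inc}, y_i^{inc} \right). \tag{3.6} \]

Here $s_1^{(k)}$ denotes the expected value of the sum of $x$, $s_2^{(k)}$ denotes the expected value of the sum of $y$, $s_1^{2\,(k)}$ denotes the expected value of the sum of $x^2$, $s_2^{2\,(k)}$ denotes the expected value of the sum of $y^2$, and $s_{12}^{(k)}$ denotes the expected value of the sum of the products of $x$ and $y$ in the $k$th iteration.

Maximization is the next step of the algorithm. The expected log–likelihood is maximized using the expected sufficient statistics from equations 3.2–3.6, and the parameters are updated accordingly. The formulas for the MLEs of complete data are obtained from equations 2.8–2.11 in Section 2.7.

\[ \hat{\mu}_x = \frac{\sum_{i=1}^{n} x_i}{n}, \quad \hat{\mu}_y = \frac{\sum_{i=1}^{n} y_i}{n}, \quad \hat{\sigma}_x = \sqrt{\frac{\sum_{i=1}^{n} x_i^2}{n} - \hat{\mu}_x^2}, \quad \hat{\sigma}_y = \sqrt{\frac{\sum_{i=1}^{n} y_i^2}{n} - \hat{\mu}_y^2}, \quad \hat{\sigma}_{xy} = \frac{\sum_{i=1}^{n} x_i y_i}{n} - \hat{\mu}_x \hat{\mu}_y \tag{3.7} \]

Since the complete data sufficient statistics in equation 3.7 cannot be computed directly due to the unobserved values in observations $m + 1, \ldots, n$, the expectations in equations 3.2–3.6 obtained in the E–step are substituted for the complete data sufficient statistics to compute the new estimates at the next iteration.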
\[ \hat{\mu}_x^{(k+1)} = \frac{s_1^{(k)}}{n} \tag{3.8} \]

\[ \hat{\mu}_y^{(k+1)} = \frac{s_2^{(k)}}{n} \tag{3.9} \]

\[ \hat{\sigma}_x^{(k+1)} = \sqrt{\frac{s_1^{2\,(k)}}{n} - \left( \hat{\mu}_x^{(k+1)} \right)^2} \tag{3.10} \]

\[ \hat{\sigma}_y^{(k+1)} = \sqrt{\frac{s_2^{2\,(k)}}{n} - \left( \hat{\mu}_y^{(k+1)} \right)^2} \tag{3.11} \]

\[ \hat{\sigma}_{xy}^{(k+1)} = \frac{s_{12}^{(k)}}{n} - \hat{\mu}_x^{(k+1)} \cdot \hat{\mu}_y^{(k+1)} \tag{3.12} \]

The estimate for $\rho$ is computed using $\hat{\sigma}_{xy}^{(k+1)}$, $\hat{\sigma}_x^{(k+1)}$, and $\hat{\sigma}_y^{(k+1)}$ from equations 3.10–3.12:

\[ \hat{\rho}^{(k+1)} = \frac{\hat{\sigma}_{xy}^{(k+1)}}{\hat{\sigma}_x^{(k+1)} \hat{\sigma}_y^{(k+1)}} \tag{3.13} \]

The E–step computes the expected sufficient statistics, starting with the initial parameters $\theta^{(0)}$. The M–step substitutes these statistics into equations 3.8–3.13 to obtain updated parameter values, which are then passed to the next E–step. This iterates until convergence, and the estimates of the intercept ($\beta_0$) and slope ($\beta_1$) of the SLR model derived in equations 2.3 and 2.4 of Section 2.3.3 are obtained from the converged parameters.

\[ \hat{\beta}_1 = \frac{\hat{\sigma}_{xy}^{(k+1)}}{\hat{\sigma}_x^{2\,(k+1)}} \quad \text{and} \quad \hat{\beta}_0 = \hat{\mu}_y^{(k+1)} - \hat{\beta}_1 \hat{\mu}_x^{(k+1)} \tag{3.14} \]

3.2 EM Algorithm for Missing Observations

This section explains the computation of the E–step and the M–step when the data contain missing observations. The computations for right censoring, left censoring, and interval censoring are detailed in Appendix A.

3.2.1 E–Step

The E–step of the EM algorithm requires estimating the expected values of the randomly missing observations conditioned on the observed data and the current parameter estimates. Datasets with BVN random variables $(x, y)$ consist of four scenarios, where “o” denotes observed data and “mis” denotes missing data, as shown in Table 3.2. In scenario 4, the observations $(x_i^{mis}, y_i^{mis})$ are excluded from the computation as they do not provide any information about the data.

| Scenario | Index | $x$ | $y$ |
|---|---|---|---|
| 1 | $1, \ldots, m$ | $x_i^o$ | $y_i^o$ |
| 2 | $m + 1, \ldots, p$ | $x_i^o$ | $y_i^{mis}$ |
| 3 | $p + 1, \ldots, q$ | $x_i^{mis}$ | $y_i^o$ |
| 4 | $q + 1, \ldots, n$ | $x_i^{mis}$ | $y_i^{mis}$ |

Table 3.2: Scenarios with observed and missing data for x and y

Scenario 1: $x_i$ observed, $y_i$ observed

The E–step involves computing the sufficient statistics expressed in equations 3.2–3.6 for the set of observations in which both $x_i$ and $y_i$ are observed. They can be expressed directly as summations of the observed values as

\[ E_{\theta^{(k)}} \left( \sum_{i=1}^{m} x_i^o \;\middle|\; x_i^o, y_i^o \right) = \sum_{i=1}^{m} x_i^o \tag{3.15} \]

\[ E_{\theta^{(k)}} \left( \sum_{i=1}^{m} y_i^o \;\middle|\; x_i^o, y_i^o \right) = \sum_{i=1}^{m} y_i^o \tag{3.16} \]

\[ E_{\theta^{(k)}} \left( \sum_{i=1}^{m} x_i^{2\,(o)} \;\middle|\; x_i^o, y_i^o \right) = \sum_{i=1}^{m} x_i^{2\,(o)} \tag{3.17} \]

\[ E_{\theta^{(k)}} \left( \sum_{i=1}^{m} y_i^{2\,(o)} \;\middle|\; x_i^o, y_i^o \right) = \sum_{i=1}^{m} y_i^{2\,(o)} \tag{3.18} \]

\[ E_{\theta^{(k)}} \left( \sum_{i=1}^{m} x_i^o y_i^o \;\middle|\; x_i^o, y_i^o \right) = \sum_{i=1}^{m} x_i^o \cdot y_i^o. \tag{3.19} \]

Scenario 2: $x_i$ observed, $y_i$ missing

The E–step involves computing the sufficient statistics expressed in equations 3.2–3.6 when $x_i$ is observed while $y_i$ is missing in a set of observations. The expected value of $x_i$ when it is observed is the value itself. However, the expected value of $y_i^{mis}$ needs to be calculated conditioned on the value of $x_i^o$. The computation of the E–step is expressed as

\[ E_{\theta^{(k)}} \left( \sum_{i=m+1}^{p} x_i^o \;\middle|\; x_i^o, y_i^{mis} \right) = \sum_{i=m+1}^{p} x_i^o \]

\[ E_{\theta^{(k)}} \left( \sum_{i=m+1}^{p} y_i^{mis} \;\middle|\; x_i^o, y_i^{mis} \right) = \sum_{i=m+1}^{p} E(y_i^{mis} \mid x_i^o = x, \theta^{(k)}) \tag{3.20} \]

\[ E_{\theta^{(k)}} \left( \sum_{i=m+1}^{p} x_i^{2\,(o)} \;\middle|\; x_i^o, y_i^{mis} \right) = \sum_{i=m+1}^{p} x_i^{2\,(o)} \]

\[ E_{\theta^{(k)}} \left( \sum_{i=m+1}^{p} y_i^{2\,(mis)} \;\middle|\; x_i^o, y_i^{mis} \right) = \sum_{i=m+1}^{p} E(y_i^{2\,(mis)} \mid x_i^o = x, \theta^{(k)}) \tag{3.21} \]

\[ E_{\theta^{(k)}} \left( \sum_{i=m+1}^{p} x_i^o y_i^{mis} \;\middle|\; x_i^o, y_i^{mis} \right) = \sum_{i=m+1}^{p} x_i^o \cdot E(y_i^{mis} \mid x_i^o = x, \theta^{(k)}). \]

$E(y_i^{mis} \mid x_i^o = x, \theta^{(k)})$ and $E(y_i^{2\,(mis)} \mid x_i^o = x, \theta^{(k)})$ in equations 3.20 and 3.21 can be computed for each missing observation using the conditional distribution of the BVN distribution. The conditional mean and variance provided by Chacko and Mathew [2020] (equation 2.5) are used to obtain the expectations of $y_i$ conditioned on $x_i$.

\[ E(y_i^{mis} \mid x_i^o = x, \theta^{(k)}) = \mu_y^{(k)} + \rho^{(k)} \frac{\sigma_y^{(k)}}{\sigma_x^{(k)}} \left( x_i^o - \mu_x^{(k)} \right) \tag{3.22} \]

\[ Var(y_i^{mis} \mid x_i^o = x, \theta^{(k)}) = \sigma_y^{(k)2} \left( 1 - \rho^{(k)2} \right) \tag{3.23} \]

The second moment of $y_i^{mis}$ conditioned on $x_i^o$ and the current parameter estimates $\theta^{(k)}$ can be obtained using equations 3.22 and 3.23 as

\[ E(y_i^{2\,(mis)} \mid x_i^o = x, \theta^{(k)}) = \left( E(y_i^{mis} \mid x_i^o = x, \theta^{(k)}) \right)^2 + Var(y_i^{mis} \mid x_i^o = x, \theta^{(k)}). \]
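The Scenario 2 computations in equations 3.20–3.23 can be sketched as a short function that accumulates the expected contributions of the rows with observed $x$ and missing $y$. This is an illustrative sketch only, not the implementation used in this study; the function name `e_step_missing_y` and the ordering of the tuple `theta` as $(\mu_x, \mu_y, \sigma_x, \sigma_y, \rho)$ are assumptions made for this example.

```python
def e_step_missing_y(x_obs, theta):
    """Expected contributions to the five sufficient statistics from rows with
    observed x and missing y (Scenario 2), given current estimates theta."""
    mu_x, mu_y, sigma_x, sigma_y, rho = theta
    # Conditional variance of y given x (equation 3.23); the same for every row
    cond_var = sigma_y ** 2 * (1.0 - rho ** 2)
    s1 = s2 = s11 = s22 = s12 = 0.0
    for x in x_obs:
        # Conditional mean of the missing y given the observed x (equation 3.22)
        ey = mu_y + rho * sigma_y / sigma_x * (x - mu_x)
        # Conditional second moment: squared mean plus variance
        ey2 = ey ** 2 + cond_var
        s1 += x          # sum of x
        s2 += ey         # expected sum of y
        s11 += x * x     # sum of x^2
        s22 += ey2       # expected sum of y^2
        s12 += x * ey    # expected sum of x * y
    return s1, s2, s11, s22, s12
```

Note that the $y^2$ statistic uses the conditional second moment rather than the square of the imputed mean; dropping the variance term would understate $\sigma_y$ in the subsequent update.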
Scenario 3: $x_i$ missing, $y_i$ observed

This is similar to scenario 2: the sufficient statistics must be computed when $x_i$ is missing while $y_i$ is observed in a set of observations. The expected value of $y_i$ when it is observed is $y_i$ itself. The missing $x_i$ is replaced with the expected value of $x_i^{mis}$ conditioned on the value of $y_i^o$. The computation of the E–step for the third scenario includes the formulas

\[ E_{\theta^{(k)}} \left( \sum_{i=p+1}^{q} x_i^{mis} \;\middle|\; x_i^{mis}, y_i^o \right) = \sum_{i=p+1}^{q} E(x_i^{mis} \mid y_i^o = y, \theta^{(k)}) \tag{3.24} \]

\[ E_{\theta^{(k)}} \left( \sum_{i=p+1}^{q} y_i^o \;\middle|\; x_i^{mis}, y_i^o \right) = \sum_{i=p+1}^{q} y_i^o \]

\[ E_{\theta^{(k)}} \left( \sum_{i=p+1}^{q} x_i^{2\,(mis)} \;\middle|\; x_i^{mis}, y_i^o \right) = \sum_{i=p+1}^{q} E(x_i^{2\,(mis)} \mid y_i^o = y, \theta^{(k)}) \tag{3.25} \]

\[ E_{\theta^{(k)}} \left( \sum_{i=p+1}^{q} y_i^{2\,(o)} \;\middle|\; x_i^{mis}, y_i^o \right) = \sum_{i=p+1}^{q} y_i^{2\,(o)} \]

\[ E_{\theta^{(k)}} \left( \sum_{i=p+1}^{q} x_i^{mis} y_i^o \;\middle|\; x_i^{mis}, y_i^o \right) = \sum_{i=p+1}^{q} y_i^o \cdot E(x_i^{mis} \mid y_i^o = y, \theta^{(k)}). \]

$E(x_i^{mis} \mid y_i^o = y, \theta^{(k)})$ and $E(x_i^{2\,(mis)} \mid y_i^o = y, \theta^{(k)})$ in equations 3.24 and 3.25 can be computed for each missing observation using the conditional distribution of the BVN distribution. The conditional mean and variance provided by Chacko and Mathew [2020] (equation 2.6) are used to obtain the expectations of $x_i^{mis}$ conditioned on $y_i^o$.

\[ E(x_i^{mis} \mid y_i^o = y, \theta^{(k)}) = \mu_x^{(k)} + \rho^{(k)} \frac{\sigma_x^{(k)}}{\sigma_y^{(k)}} \left( y_i^o - \mu_y^{(k)} \right) \tag{3.26} \]

\[ Var(x_i^{mis} \mid y_i^o = y, \theta^{(k)}) = \sigma_x^{(k)2} \left( 1 - \rho^{(k)2} \right) \tag{3.27} \]

The second moment of $x_i^{mis}$ conditioned on $y_i^o$ and the current parameter estimates $\theta^{(k)}$ can be obtained using equations 3.26 and 3.27 as

\[ E(x_i^{2\,(mis)} \mid y_i^o = y, \theta^{(k)}) = \left( E(x_i^{mis} \mid y_i^o = y, \theta^{(k)}) \right)^2 + Var(x_i^{mis} \mid y_i^o = y, \theta^{(k)}). \]

Computing the Total Expectation for Missing Data

The E–step quantities in equations 3.2–3.6 are obtained by summing the expectations obtained in the three scenarios. Thus,

\[ s_1^{(k)} = E_{\theta^{(k)}} \left( \sum_{i=1}^{n} x_i \;\middle|\; x_i^o, y_i^o, x_i^{mis}, y_i^{mis} \right) = \sum_{i=1}^{m} x_i^o + \sum_{i=m+1}^{p} x_i^o + \sum_{i=p+1}^{q} E(x_i^{mis} \mid y_i^o = y, \theta^{(k)}) \tag{3.28} \]

\[ s_2^{(k)} = E_{\theta^{(k)}} \left( \sum_{i=1}^{n} y_i \;\middle|\; x_i^o, y_i^o, x_i^{mis}, y_i^{mis} \right) = \sum_{i=1}^{m} y_i^o + \sum_{i=m+1}^{p} E(y_i^{mis} \mid x_i^o = x, \theta^{(k)}) + \sum_{i=p+1}^{q} y_i^o \tag{3.29} \]

\[ s_1^{2\,(k)} = E_{\theta^{(k)}} \left( \sum_{i=1}^{n} x_i^2 \;\middle|\; x_i^o, y_i^o, x_i^{mis}, y_i^{mis} \right) = \sum_{i=1}^{m} x_i^{2\,(o)} + \sum_{i=m+1}^{p} x_i^{2\,(o)} + \sum_{i=p+1}^{q} E(x_i^{2\,(mis)} \mid y_i^o = y, \theta^{(k)}) \tag{3.30} \]

\[ s_2^{2\,(k)} = E_{\theta^{(k)}} \left( \sum_{i=1}^{n} y_i^2 \;\middle|\; x_i^o, y_i^o, x_i^{mis}, y_i^{mis} \right) = \sum_{i=1}^{m} y_i^{2\,(o)} + \sum_{i=m+1}^{p} E(y_i^{2\,(mis)} \mid x_i^o = x, \theta^{(k)}) + \sum_{i=p+1}^{q} y_i^{2\,(o)} \tag{3.31} \]

\[ s_{12}^{(k)} = E_{\theta^{(k)}} \left( \sum_{i=1}^{n} x_i y_i \;\middle|\; x_i^o, y_i^o, x_i^{mis}, y_i^{mis} \right) = \sum_{i=1}^{m} x_i^o \cdot y_i^o + \sum_{i=m+1}^{p} x_i^o \cdot E(y_i^{mis} \mid x_i^o = x, \theta^{(k)}) + \sum_{i=p+1}^{q} y_i^o \cdot E(x_i^{mis} \mid y_i^o = y, \theta^{(k)}). \tag{3.32} \]

3.2.2 M–step

The M–step is the next step of the algorithm. The expected sufficient statistics calculated in the E–step (equations 3.28–3.32) are substituted into the complete data MLE formulas (equations 2.8–2.11 in Section 2.7), as in equations 3.8–3.13, and the parameters are updated accordingly. The E–step and M–step of the EM algorithm for right censored, left censored, and interval censored observations are presented in Appendix A.

3.3 Performance Metrics to Evaluate the Regression Estimates

The performance of the regression estimates in the four simulation scenarios is evaluated using several widely used performance metrics. These metrics include bias, empirical variance, asymptotic variance, Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Coverage Probability (CP), and Relative Root Mean Squared Error (RRMSE) computed based on asymptotic values. Each of these metrics provides insight into the accuracy and reliability of the parameter estimates for the three methods under consideration.

Bias: measures the difference between the expected (or average) value of the estimates and the true value of the parameter being estimated. It indicates the systematic error in the model. Mathematically, the bias of $\hat{\beta}_0$ can be expressed as

\[ Bias(\hat{\beta}_0) = E(\hat{\beta}_0) - \beta_0. \]

The empirical estimate of $E(\hat{\beta}_0)$ is obtained by averaging the $\hat{\beta}_0$ values obtained from the $R$ simulated datasets, and it is written as

\[ E(\hat{\beta}_0) = \frac{\hat{\beta}_{01} + \hat{\beta}_{02} + \ldots + \hat{\beta}_{0R}}{R} = \frac{1}{R} \sum_{j=1}^{R} \hat{\beta}_{0j}. \]

The estimator $\hat{\beta}_0$ is considered unbiased if $E(\hat{\beta}_0) = \beta_0$.
If this condition is not met, the estimator has a positive or negative bias, implying that it tends to overestimate or underestimate the true parameter on average. Therefore, a low bias is preferable, as it signifies that the estimates are, on average, closer to the true parameter value.

Empirical Variance: is a statistical measure used to quantify the spread of the estimates around their average value. It provides information about the variability or uncertainty of the estimates within the simulation, indicating how much individual estimates deviate from the average of the estimates. It can be written mathematically as

\[
\mathrm{Var}(\hat{\beta}_0) = \frac{1}{R} \sum_{j=1}^{R} \left(\hat{\beta}_{0j} - E(\hat{\beta}_0)\right)^2.
\]

A low empirical variance indicates more precise and reliable estimates, while a high empirical variance suggests greater uncertainty and variability among the estimates, leading to less precise and less reliable conclusions.

Asymptotic Variance: describes the approximation of the variance for large samples ($n \to \infty$) when the central limit theorem is employed. The asymptotic variances of the intercept ($\hat{\beta}_0$) and slope ($\hat{\beta}_1$) estimates can be obtained using

\begin{align*}
\mathrm{Var}(\hat{\beta}_0) &= \frac{1}{R} \sum_{j=1}^{R} \sigma_j^2 \left[\frac{1}{n} + \frac{\bar{x}_j^2}{\sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2}\right] \tag{3.33}\\
\mathrm{Var}(\hat{\beta}_1) &= \frac{1}{R} \sum_{j=1}^{R} \left[\frac{\sigma_j^2}{\sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2}\right] \tag{3.34}
\end{align*}

where n is the number of observations in each dataset, $x_{ij}$ is the $i$th observation of x in the $j$th dataset, $\bar{x}_j$ is the average of the x observations in the $j$th dataset, and $\sigma_j^2$ is the variance of the error term for the $j$th dataset,

\[
\sigma_j^2 = \frac{1}{n-2} \sum_{i=1}^{n} (y_{ij} - \hat{y}_{ij})^2,
\]

where $y_{ij}$ and $\hat{y}_{ij}$ are the $i$th observed and predicted y in the $j$th dataset. However, formulas 3.33 and 3.34 cannot be used directly to calculate the asymptotic variances of $\hat{\beta}_0$ and $\hat{\beta}_1$ for the proposed methodology because of the incomplete observations present in the data.
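For a single complete dataset, the term inside the averages in equations 3.33 and 3.34 can be computed as in the following Python sketch (an illustration only; the thesis uses R, and the function name `slr_asymptotic_variances` and the toy data are hypothetical). Averaging these per-dataset values over the R simulated datasets would give the quantities in 3.33 and 3.34.

```python
def slr_asymptotic_variances(x, y):
    """Fit y = b0 + b1 * x by least squares and return (b0, b1, var_b0, var_b1)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    # Error variance: residual sum of squares divided by n - 2.
    rss = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sigma2 = rss / (n - 2)
    # Per-dataset terms of equations 3.33 and 3.34.
    var_b0 = sigma2 * (1.0 / n + x_bar ** 2 / sxx)
    var_b1 = sigma2 / sxx
    return b0, b1, var_b0, var_b1


# Toy data, for illustration only.
b0, b1, var_b0, var_b1 = slr_asymptotic_variances([0.0, 1.0, 2.0, 3.0],
                                                  [1.0, 3.0, 2.0, 4.0])
```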
The variance of the error term for the $j$th dataset under the proposed method ($\sigma^2_{j,\text{proposed}}$) is expanded and re-written by substituting $\hat{y}_{ij}$ with $\hat{\beta}_0 + \hat{\beta}_1 x_{ij}$ as

\begin{align*}
\sigma^2_{j,\text{proposed}} &= \frac{1}{n-2} \sum_{i=1}^{n} (y_{ij} - \hat{y}_{ij})^2\\
&= \frac{1}{n-2} \sum_{i=1}^{n} \left(y_{ij} - (\hat{\beta}_0 + \hat{\beta}_1 x_{ij})\right)^2\\
&= \frac{1}{n-2} \sum_{i=1}^{n} \left(y_{ij}^2 + \hat{\beta}_0^2 + \hat{\beta}_1^2 x_{ij}^2 - 2\hat{\beta}_0 y_{ij} - 2\hat{\beta}_1 x_{ij} y_{ij} + 2\hat{\beta}_0 \hat{\beta}_1 x_{ij}\right)\\
&= \frac{1}{n-2} \left[\sum_{i=1}^{n} y_{ij}^2 + n\hat{\beta}_0^2 + \hat{\beta}_1^2 \sum_{i=1}^{n} x_{ij}^2 - 2\hat{\beta}_0 \sum_{i=1}^{n} y_{ij} - 2\hat{\beta}_1 \sum_{i=1}^{n} x_{ij} y_{ij} + 2\hat{\beta}_0 \hat{\beta}_1 \sum_{i=1}^{n} x_{ij}\right]. \tag{3.35}
\end{align*}

The sufficient statistics in equation 3.35 are replaced with their expected values to obtain the asymptotic variances of the estimates in the proposed methodology:

\[
\sigma^2_{j,\text{proposed}} = \frac{1}{n-2} \left[\sum_{i=1}^{n} E(y_{ij}^2) + n\hat{\beta}_0^2 + \hat{\beta}_1^2 \sum_{i=1}^{n} E(x_{ij}^2) - 2\hat{\beta}_0 \sum_{i=1}^{n} E(y_{ij}) - 2\hat{\beta}_1 \sum_{i=1}^{n} E(x_{ij} y_{ij}) + 2\hat{\beta}_0 \hat{\beta}_1 \sum_{i=1}^{n} E(x_{ij})\right].
\]

Similarly, $\bar{x}_j$ and $\sum_{i=1}^{n} (x_{ij} - \bar{x}_j)^2$ are expanded and re-written by replacing the sufficient statistics with their expected values, giving the final formulas

\begin{align*}
\mathrm{Var}(\hat{\beta}_0) &= \frac{1}{R} \sum_{j=1}^{R} \sigma^2_{j,\text{proposed}} \left[\frac{1}{n} + \frac{\left(\frac{1}{n}\sum_{i=1}^{n} E(x_{ij})\right)^2}{\sum_{i=1}^{n} E(x_{ij}^2) - \frac{1}{n}\left(\sum_{i=1}^{n} E(x_{ij})\right)^2}\right]\\
\mathrm{Var}(\hat{\beta}_1) &= \frac{1}{R} \sum_{j=1}^{R} \left[\frac{\sigma^2_{j,\text{proposed}}}{\sum_{i=1}^{n} E(x_{ij}^2) - \frac{1}{n}\left(\sum_{i=1}^{n} E(x_{ij})\right)^2}\right].
\end{align*}

The asymptotic variance of the parameters of the BVN distribution can be derived by computing the inverse of the FIM. First, compute the score function $U(\theta)$, which is the gradient of the log-likelihood function with respect to $\theta$, given by $U(\theta) = \frac{\partial \ell(\theta)}{\partial \theta}$. Next, compute the Hessian matrix $H(\theta)$ by taking the second-order partial derivatives of the log-likelihood function, $H(\theta) = \frac{\partial^2 \ell(\theta)}{\partial \theta \partial \theta^T}$. Then, obtain the FIM by taking the negative expected value of the Hessian matrix, $I(\theta) = -E[H(\theta)]$. Finally, obtain the asymptotic variance of the parameters by inverting the FIM; the diagonal elements of $I(\theta)^{-1}$ provide the asymptotic variance of each parameter.
Making use of its symmetry, the FIM for the BVN distribution, with the parameters ordered as $\theta = (\mu_x, \mu_y, \sigma_x, \sigma_y, \rho)$, can be expressed [Amini and Ahmadi, 2013] as

\[
I(\theta) = -E\left[\frac{\partial^2 \ell(\theta)}{\partial \theta \partial \theta^T}\right] = n
\begin{pmatrix}
\frac{1}{\sigma_x^2 (1-\rho^2)} & -\frac{\rho}{\sigma_x \sigma_y (1-\rho^2)} & 0 & 0 & 0\\
-\frac{\rho}{\sigma_x \sigma_y (1-\rho^2)} & \frac{1}{\sigma_y^2 (1-\rho^2)} & 0 & 0 & 0\\
0 & 0 & \frac{2-\rho^2}{\sigma_x^2 (1-\rho^2)} & \frac{-\rho^2}{\sigma_x \sigma_y (1-\rho^2)} & \frac{-\rho}{\sigma_x (1-\rho^2)}\\
0 & 0 & \frac{-\rho^2}{\sigma_x \sigma_y (1-\rho^2)} & \frac{2-\rho^2}{\sigma_y^2 (1-\rho^2)} & \frac{-\rho}{\sigma_y (1-\rho^2)}\\
0 & 0 & \frac{-\rho}{\sigma_x (1-\rho^2)} & \frac{-\rho}{\sigma_y (1-\rho^2)} & \frac{1+\rho^2}{(1-\rho^2)^2}
\end{pmatrix}.
\]

Mean Squared Error (MSE): The MSE of an estimator $\hat{\beta}_0$ measures the mean squared difference between the estimator $\hat{\beta}_0$ and the true parameter $\beta_0$. It is a metric used to assess the accuracy of estimates. The MSE of an estimator equals the sum of the variance of the estimator and the square of its bias. A lower MSE indicates a more accurate estimator, while a higher MSE indicates a less accurate one.

\[
\mathrm{MSE}(\hat{\beta}_0) = E\left[(\hat{\beta}_0 - \beta_0)^2\right] = \mathrm{Var}(\hat{\beta}_0) + \mathrm{Bias}^2(\hat{\beta}_0)
\]

Root Mean Squared Error (RMSE): The RMSE of an estimator is the square root of the MSE of the estimator and can be defined as

\[
\mathrm{RMSE}(\hat{\beta}_0) = \sqrt{\mathrm{MSE}(\hat{\beta}_0)}.
\]

Confidence Interval (CI): A CI is a statistical measure used to make inferences about a population. The narrower the CI, the greater the precision of the estimator. The formula for calculating the CI is

\[
\mathrm{CI} = \text{point estimator} \pm \text{critical value} \times \sqrt{\text{variance of the estimator}}.
\]

The critical value corresponds to the chosen confidence level of the reference distribution; the z–value is used for the standard normal distribution. As an example, the lower and upper bounds of a 95% CI for $\hat{\beta}_0$ can be written as

\[
\mathrm{CI} = \left(\hat{\beta}_0 - 1.96 \times \sqrt{\mathrm{Var}(\hat{\beta}_0)},\; \hat{\beta}_0 + 1.96 \times \sqrt{\mathrm{Var}(\hat{\beta}_0)}\right).
\]

Coverage Probability (CP): CP is defined as the probability that a CI will contain the true value of the parameter of interest. It measures the effectiveness of the CI in capturing the true parameter value over repeated sampling. Typically, we aim for this probability to match the nominal confidence level chosen for constructing the interval, which is 95% here.
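The bias, MSE, RMSE, and CP definitions above can be tied together in a short Python sketch (illustrative only; the thesis uses R, and `summarize_estimates` and the toy numbers are hypothetical). The sketch assumes that each simulated dataset supplies one estimate together with a variance for that estimate, and it uses the empirical variance of the estimates in the MSE decomposition.

```python
import math


def summarize_estimates(estimates, variances, true_value, z=1.96):
    """Summarise R simulated estimates of one parameter.

    estimates: one estimate per simulated dataset
    variances: variance attached to each estimate (used for its CI)
    """
    R = len(estimates)
    mean_est = sum(estimates) / R
    bias = mean_est - true_value
    # Empirical variance around the average estimate (divisor R, as in the text).
    emp_var = sum((b - mean_est) ** 2 for b in estimates) / R
    mse = emp_var + bias ** 2
    rmse = math.sqrt(mse)
    # Coverage: share of intervals estimate +/- z * sqrt(variance) containing the truth.
    covered = 0
    for b, v in zip(estimates, variances):
        half_width = z * math.sqrt(v)
        if b - half_width <= true_value <= b + half_width:
            covered += 1
    return {"bias": bias, "variance": emp_var, "mse": mse,
            "rmse": rmse, "cp": covered / R}


# Toy input: four intercept estimates with a common variance of 0.01.
toy_estimates = [-1.2, -1.4, -1.3, -1.5]
summary = summarize_estimates(toy_estimates, [0.01] * 4, true_value=-1.3)
```

With the divisor R in the empirical variance, the variance-plus-squared-bias form of the MSE coincides with the average squared distance of the estimates from the true value.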
Relative Root Mean Squared Error (RRMSE): measures the performance of one model relative to another. The RRMSE of model A relative to model B can be defined as

\[
\mathrm{RRMSE}(\hat{\beta}_A, \hat{\beta}_B) = \frac{\mathrm{RMSE}_{\hat{\beta}_A}}{\mathrm{RMSE}_{\hat{\beta}_B}}.
\]

If the RRMSE is greater than 1, the estimate from model A has the larger RMSE, which implies the superior performance of $\hat{\beta}$ from model B compared to model A. If the RRMSE is close to 1, the two estimates perform similarly.

Akaike Information Criterion (AIC): AIC is a statistical criterion used to select the best model from multiple models fitted to a given dataset. It favours the model that best explains the variance in the dependent variable while using fewer parameters, thereby avoiding overfitting by penalizing complexity. The likelihood function for the linear regression model, under the assumption that the errors are normally distributed with mean 0 and variance $\sigma^2$, is given in equation 2.1. The corresponding log-likelihood function is given in equation 2.2. Given the log-likelihood function $\ell(\beta_0, \beta_1, \sigma^2)$, the AIC can be computed using the formula

\[
\mathrm{AIC} = 2k - 2\ell(\beta_0, \beta_1, \sigma^2)
\]

where k is the number of parameters in the model. However, when dealing with incomplete data (such as missing or censored observations), the AIC needs to be adjusted to account for the effective sample size. For methods that discard incomplete data, such as the naive method, which uses only the complete observations, the AIC should be scaled. This scaling is done by multiplying the AIC by the ratio of the total number of observations n to the number of complete observations m. This adjustment corrects for the reduced effective sample size and ensures a fair comparison with models using all available data. The adjusted AIC formula for a subset of data can thus be written as

\[
\mathrm{AIC}_{\text{adjusted}} = \left[2k - 2\ell(\beta_0, \beta_1, \sigma^2)\right] \times \frac{n}{m}.
\]

This adjustment is important because it accounts for the actual amount of data used in the estimation process, thereby providing a more accurate measure of model quality.

Chapter 4

Simulation Studies

This chapter covers the design of the simulation study, presents the extensive simulation results, and discusses the outcomes for each type of incompleteness.

4.1 Simulation Design

Let (x, y) be BVN random variables with means $\mu_x$ and $\mu_y$, standard deviations $\sigma_x$ and $\sigma_y$, and correlation $\rho$ between x and y. The model is

\[
\begin{pmatrix} x \\ y \end{pmatrix} \sim N\left(\begin{pmatrix} \mu_x \\ \mu_y \end{pmatrix}, \begin{pmatrix} \sigma_x^2 & \rho \sigma_x \sigma_y \\ \rho \sigma_x \sigma_y & \sigma_y^2 \end{pmatrix}\right). \tag{4.1}
\]

We generated R = 500 simulated datasets from the BVN model given in 4.1 by setting $\mu_x = 3$, $\mu_y = -1$, $\sigma_x = 10$, $\sigma_y = 5$ for different correlations $\rho$ = 0.2, 0.4, 0.6, 0.8 and sample sizes n = 50, 100, and 500. Then, we introduced the four different types of incomplete data separately into the R = 500 simulated datasets, considering three total incomplete proportions p = 0.1, 0.3, and 0.5, each split equally between x and y. The possible combinations of datasets for different parameters are shown in Table 4.1; a total of 12 data scenarios were evaluated for each sample size, resulting in 36 combinations across all three sample sizes. More precisely, for the given parameter values ($\mu_x = 3$, $\mu_y = -1$, $\sigma_x = 10$, $\sigma_y = 5$), sample size n = 50, and $\rho$ = 0.2, we generated 3 datasets using proportions p = 0.1, 0.3, and 0.5, respectively, and each dataset is simulated R = 500 times. Similarly, across all correlation and proportion values, we generated a total of 12 datasets for n = 50, 12 datasets for n = 100, and 12 datasets for n = 500. The simulation is done in R, and the rmvnorm function from the mvtnorm package in R [R Core Team, 2022] is used to generate the data.
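The data-generating step can be sketched in Python as follows. This is only an illustration under stated assumptions: the thesis generates data with rmvnorm in R, whereas this sketch builds the correlated pair from two independent standard normal draws; the function name `simulate_bvn` and the seed are hypothetical.

```python
import math
import random


def simulate_bvn(n, mu_x, mu_y, sigma_x, sigma_y, rho, rng):
    """Draw n pairs (x, y) from a bivariate normal with correlation rho."""
    data = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x = mu_x + sigma_x * z1
        # Mixing z1 and z2 gives y variance sigma_y^2 and correlation rho with x.
        y = mu_y + sigma_y * (rho * z1 + math.sqrt(1.0 - rho ** 2) * z2)
        data.append((x, y))
    return data


# One dataset with the Section 4.1 settings (n = 500, rho = 0.2).
rng = random.Random(2024)
sample = simulate_bvn(500, 3.0, -1.0, 10.0, 5.0, 0.2, rng)
```

Repeating the call R = 500 times for each combination of n, ρ, and p would reproduce the structure of the simulation design, though not the exact random draws used in the thesis.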
Sample Size (n)   Correlation (ρ)   Proportion (p)
50                0.2               0.1, 0.3, 0.5
                  0.4               0.1, 0.3, 0.5
                  0.6               0.1, 0.3, 0.5
                  0.8               0.1, 0.3, 0.5
100               0.2               0.1, 0.3, 0.5
                  0.4               0.1, 0.3, 0.5
                  0.6               0.1, 0.3, 0.5
                  0.8               0.1, 0.3, 0.5
500               0.2               0.1, 0.3, 0.5
                  0.4               0.1, 0.3, 0.5
                  0.6               0.1, 0.3, 0.5
                  0.8               0.1, 0.3, 0.5

Table 4.1: Simulation details for different sample sizes and correlations for a single proportion of x and y.

Missingness is introduced to the datasets by considering a missing proportion p. For a dataset with n observations, the missing proportion is evenly split between x and y, resulting in $n_{missing} = 0.5p \times n$, where $n_{missing}$ is the number of missing values for each variable. A random set of $n_{missing}$ indices is drawn from the sequence 1 to n to determine which observations will be missing.

The left censored observations in x and y are introduced by determining the quantiles corresponding to 0.5p. This process creates thresholds that identify the values below these percentiles. Specifically, the threshold $c_x^{lc}$ for x, where $P(x \le c_x^{lc}) = 0.5p$, is calculated using the inverse CDF of x, i.e., $c_x^{lc} = \Phi^{-1}(0.5p)$. Values below this threshold are replaced with $c_x^{lc}$ to indicate that they are left censored. Similarly, left censored observations are introduced for y using a threshold $c_y^{lc}$ calculated in the same manner. The quantile function in R is used to obtain the threshold values.

Right censored data is introduced in a similar way. The right censored observations in x and y are determined using the quantile method. The threshold $c_x^{rc}$ for x, where $P(x \ge c_x^{rc}) = 0.5p$, is calculated using the inverse CDF of x: $c_x^{rc} = \Phi^{-1}(1 - 0.5p)$. Values above this threshold are replaced with $c_x^{rc}$ to indicate that they are right censored. Similarly, right censored observations are introduced for y using a threshold $c_y^{rc}$ calculated in the same manner.

The process for introducing interval censored data in x and y follows a similar methodology.
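The missing-value and one-sided censoring mechanisms described above can be sketched in Python as follows. This is an assumption-laden illustration: the thesis obtains the thresholds with R's quantile function, whereas this sketch evaluates the inverse normal CDF at the true mean and standard deviation; rounding $n_{missing}$ to the nearest integer is also an assumption, and all function names are hypothetical.

```python
import random
from statistics import NormalDist


def introduce_missing(values, p, rng):
    """Replace a random set of 0.5 * p * n entries with None (missing)."""
    n = len(values)
    n_missing = int(round(0.5 * p * n))  # assumption: round to nearest integer
    missing_idx = set(rng.sample(range(n), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(values)]


def left_censor(values, p, mu, sigma):
    """Replace values below the 0.5p quantile with the threshold itself."""
    c = NormalDist(mu, sigma).inv_cdf(0.5 * p)
    return [max(v, c) for v in values], c


def right_censor(values, p, mu, sigma):
    """Replace values above the (1 - 0.5p) quantile with the threshold itself."""
    c = NormalDist(mu, sigma).inv_cdf(1.0 - 0.5 * p)
    return [min(v, c) for v in values], c


# Example with n = 100 draws of x (mean 3, standard deviation 10) and p = 0.1.
rng = random.Random(1)
x = [rng.gauss(3.0, 10.0) for _ in range(100)]
x_missing = introduce_missing(x, 0.1, rng)
x_left, c_left = left_censor(x, 0.1, 3.0, 10.0)
x_right, c_right = right_censor(x, 0.1, 3.0, 10.0)
```

The same functions would be applied to y with its own mean and standard deviation.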
Initially, random probabilities are generated to determine the points at which the lower bound of interval censoring occurs in x and y. These probabilities are constrained such that their sum with the desired censoring proportion for x and y, each denoted as 0.5p, does not exceed 1. The lower bounds for interval censored observations are then calculated using quantiles based on these random probabilities. Subsequently, upper bounds for interval censoring are computed using quantiles based on the sum of the random probabilities and the desired censoring proportion of 0.5p for both x and y.

The proposed model is compared with a naive model and a benchmark model. First, an SLR model is fitted to the complete dataset using the CLS method, which is the benchmark model. Then, the LS estimates of a naive model are obtained by fitting an SLR model to the subset of fully observed data. The parameter estimates of the EM based model, CLS model, and naive model are compared to assess the performance of the proposed approach. In addition, a comprehensive study is conducted to determine the influence of sample size, incomplete data proportion, and correlation on the estimation process. The parameters of the BVN distribution ($\mu_x$, $\mu_y$, $\sigma_x$, $\sigma_y$, $\rho$) are also estimated, and the results are presented in Appendix B.

4.2 Simulation Results

This section presents the simulation results for four different types of incomplete data: missing, right censoring, left censoring, and interval censoring.

4.2.1 Results in the Presence of Missing Observations

The true intercept $\beta_0$ and slope $\beta_1$ are computed from the true parameters of the BVN model used to generate the data, using the formulas

\[
\beta_0 = \mu_y - \beta_1 \mu_x \quad \text{and} \quad \beta_1 = \rho \frac{\sigma_y}{\sigma_x}.
\]

Figure 4.1 illustrates the boxplots of parameter estimates for an SLR model with n = 100 and $\rho$ = 0.2 for the CLS models, the naive models, and the proposed models.
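As a check on the reference values used in the figures and tables of this section, substituting the Section 4.1 parameter values ($\mu_x = 3$, $\mu_y = -1$, $\sigma_x = 10$, $\sigma_y = 5$) with $\rho = 0.2$ into the formulas for the true slope and intercept gives:

```latex
\begin{align*}
\beta_1 &= \rho \frac{\sigma_y}{\sigma_x} = 0.2 \times \frac{5}{10} = 0.1,\\
\beta_0 &= \mu_y - \beta_1 \mu_x = -1 - 0.1 \times 3 = -1.3.
\end{align*}
```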
The averages of the estimates from all three models are approximately centered around the true values of the intercept (-1.3) and slope (0.1), as indicated by the horizontal red dashed lines in these figures. As the proportion of missing data increases, the variability of the estimates also increases for all models. The boxplots displaying parameter estimates for the BVN distribution for this combination are included in Appendix A.

Figure 4.1: Boxplots for estimates of intercept ($\hat{\beta}_0$) and slope ($\hat{\beta}_1$), n = 100 and ρ = 0.2 over different missing proportions of 0.1, 0.3, and 0.5

The asymptotic variances of the intercept ($\hat{\beta}_0$) and slope ($\hat{\beta}_1$) estimates for the CLS, naive, and proposed models are depicted in Figure 4.2 (the red line is always positioned under the blue line whenever it is obscured). These variances are evaluated across varying missing proportions and for the three different sample sizes. The results indicate that the proposed estimates consistently exhibit lower and more stable asymptotic variances than the naive estimates, particularly as the proportion of missing data increases. For the smaller sample sizes n = 50 and n = 100, the naive approach shows a marked increase in variance for the intercept and slope estimates at higher missing levels. In contrast, the CLS and proposed estimates maintain low variances at these sample sizes. As the sample size increases to n = 500, the variances of all three models decrease, with the proposed estimates consistently showing the lowest variances, as expected. These findings demonstrate the superior stability and reliability of the proposed methodology in regression modeling with missing data, particularly in smaller sample scenarios.
Figure 4.2: Line charts for asymptotic variance of intercept ($\hat{\beta}_0$) and slope ($\hat{\beta}_1$) when ρ = 0.2 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

Tables 4.2 and 4.3 provide computations of bias, RMSE, 95% CP, and RRMSE for $\hat{\beta}_0$ and $\hat{\beta}_1$ across different sample sizes and missing proportions. The naive intercept estimates exhibit a higher RMSE than the proposed estimates for every sample size and missing proportion, and a higher bias in most settings. Moreover, both estimates show high CI coverage rates, ranging from 89% to 97% for the intercept. However, the naive estimates demonstrate a higher CP, attributed to their higher variance. The relative RMSE values of the CLS estimates over the proposed estimates lie close to 1. The relative RMSE of the naive estimates over the proposed estimates exceeds 1, indicating the superior performance of the proposed models over the naive models.

Table 4.2: Bias, asymptotic RMSE, 95% CP, and relative RMSE of intercept ($\hat{\beta}_0$) for ρ = 0.2 over sample sizes 50, 100, and 500 and missing proportions 0.1, 0.3 and 0.5

          |  Bias                    |  RMSE                 |  CP                   |  RRMSE
  n    p  |     CLS   Naive      PM  |    CLS  Naive     PM  |    CLS  Naive     PM  |  CLS/PM  Naive/PM
 50  0.1  |   0.002  -0.019  -0.005  |  0.738  0.771  0.740  |   0.97   0.96   0.96  |    1.00      1.04
     0.3  |   0.002  -0.013  -0.011  |  0.738  0.887  0.732  |   0.97   0.94   0.94  |    1.01      1.21
     0.5  |   0.002  -0.019  -0.023  |  0.738  0.982  0.717  |   0.97   0.95   0.93  |    1.03      1.37
100  0.1  |   0.014   0.022   0.015  |  0.517  0.544  0.517  |   0.97   0.96   0.96  |    1.00      1.05
     0.3  |   0.014   0.021   0.014  |  0.517  0.609  0.513  |   0.97   0.97   0.95  |    1.01      1.19
     0.5  |   0.014   0.034   0.027  |  0.517  0.692  0.504  |   0.97   0.95   0.89  |    1.02      1.37
500  0.1  |  -0.002   0.002  -0.001  |  0.229  0.241  0.229  |   0.94   0.96   0.94  |    1.00      1.05
     0.3  |  -0.002   0.001   0.000  |  0.229  0.269  0.227  |   0.94   0.95   0.91  |    1.01      1.19
     0.5  |  -0.002  -0.009  -0.006  |  0.229  0.306  0.224  |   0.94   0.95   0.92  |    1.02      1.37

Table 4.3: Bias, asymptotic RMSE, 95% CP, and relative RMSE of slope ($\hat{\beta}_1$) for ρ = 0.2 over sample sizes 50, 100, and 500 and missing proportions 0.1, 0.3 and 0.5

          |  Bias                    |  RMSE                 |  CP                   |  RRMSE
  n    p  |     CLS   Naive      PM  |    CLS  Naive     PM  |    CLS  Naive     PM  |  CLS/PM  Naive/PM
 50  0.1  |   0.000   0.002   0.002  |  0.072  0.076  0.074  |   0.94   0.95   0.95  |    0.98      1.03
     0.3  |   0.000   0.003   0.003  |  0.072  0.088  0.078  |   0.94   0.95   0.91  |    0.93      1.13
     0.5  |   0.000   0.002   0.002  |  0.072  0.098  0.079  |   0.94   0.94   0.88  |    0.91      1.23
100  0.1  |  -0.001  -0.003  -0.003  |  0.050  0.053  0.051  |   0.94   0.95   0.94  |    0.98      1.04
     0.3  |  -0.001  -0.002  -0.001  |  0.050  0.059  0.054  |   0.94   0.94   0.91  |    0.93      1.10
     0.5  |  -0.001  -0.004  -0.003  |  0.050  0.068  0.056  |   0.94   0.95   0.88  |    0.90      1.22
500  0.1  |   0.000   0.001   0.000  |  0.022  0.022  0.022  |   0.96   0.96   0.95  |    1.00      1.00
     0.3  |   0.000   0.001   0.000  |  0.022  0.027  0.022  |   0.96   0.94   0.91  |    1.00      1.18
     0.5  |   0.000   0.000   0.000  |  0.022  0.030  0.025  |   0.96   0.93   0.90  |    0.91      1.22

The bias of the slope estimate is approximately the same in the naive and proposed models. In contrast, the RMSE of the naive estimates is higher than that of the proposed estimates because of their higher variance. As with the coverage probabilities obtained for the intercept, the 95% CP of the naive slope estimates is higher than that of the proposed slope estimates because of the higher variance. However, the RRMSE of the naive over the proposed estimates is greater than 1 in most settings, which shows that the proposed estimates outperform the naive estimates.

4.2.2 Results in the Presence of Right Censored Observations

Figure 4.3 illustrates the boxplots of parameter estimates for SLR models with n = 100 and ρ = 0.2 for the CLS models, the naive models, and the proposed models in the presence of right censored observations. The proposed models show consistent estimates of the intercept with less variability around the true value $\beta_0 = -1.3$ (indicated by the red dashed line). In contrast, the naive models show a significant deviation from the true value, especially with higher proportions of right censored data. This indicates a high level of bias and variability, and as the proportion of right censored data increases, the variability of the estimates also increases for all three models.
However, this increase in variability is more pronounced for the naive models than for the proposed models. The slope ($\hat{\beta}_1$) estimates obtained from the naive models lie closer to the true value than the intercept estimates do, but gradually deviate from it as the right censoring proportion increases. In addition, the variability of the slope estimates obtained from the naive approach increases significantly, with extreme outliers under heavy censoring. The boxplots displaying parameter estimates for the BVN distribution for this combination are included in Appendix B.

Figure 4.3: Boxplots for estimates of intercept ($\hat{\beta}_0$) and slope ($\hat{\beta}_1$), when n = 100 and ρ = 0.2 over different right censoring proportions of 0.1, 0.3, and 0.5

The asymptotic variances of the intercept ($\hat{\beta}_0$) and slope ($\hat{\beta}_1$) estimates for the CLS, naive, and proposed models in the presence of right censored data are depicted in Figure 4.4. These variances are evaluated across varying proportions of right censored data and for the three different sample sizes. The findings demonstrate that our proposed methodology consistently yields lower and more stable asymptotic variances for the slope than the naive approach, especially as the proportion of right censored data increases. While the naive models exhibit variances for the intercept that are close to those of the other models when censoring is light (0.1 and 0.3), they show higher variance under small sample sizes (50, 100) and heavy censoring (0.5).
The intercept estimates exhibit a significantly higher magnitude of bias and RMSE in naive models compared to the proposed models, irrespective of sample size and proportion of right censoring. Moreover, the CI coverage rate of naive estimates has significantly reduced compared to proposed estimates. The proposed models demonstrate a higher CP ranging from 93% to 97% for the intercept. The relative RMSE of CLS over proposed estimates indicates the better performance of proposed estimates, and the relative RMSE of naive over proposed estimates is greater than 1. This indicates that the proposed approach outperforms the naive approach. 48 Table 4.4: Bias, asymptotic RMSE, 95% CP, and relative RMSE of intercept (βˆ0 ) for ρ = 0.2 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5 n 50 100 500 p Bias RMSE CP RRMSE CLS Naive PM CLS Naive PM CLS Naive PM CLS/PM Naive/PM 0.1 0.002 -0.533 0.021 0.738 0.924 0.746 0.97 0.91 0.97 0.99 1.24 0.3 0.002 -1.261 0.030 0.738 1.482 0.747 0.97 0.64 0.96 0.99 1.98 0.5 0.002 -1.967 0.035 0.738 2.125 0.757 0.97 0.29 0.95 0.97 2.81 0.1 0.014 -0.444 0.017 0.517 0.691 0.518 0.97 0.89 0.97 1.00 1.33 0.3 0.014 -1.191 0.018 0.517 1.308 0.517 0.97 0.40 0.97 1.00 2.53 0.5 0.014 -1.871 0.009 0.517 1.950 0.522 0.97 0.08 0.97 0.99 3.74 0.1 -0.002 -0.467 -0.001 0.229 0.522 0.229 0.94 0.48 0.94 1.00 2.28 0.3 -0.002 -1.220 -0.003 0.229 1.243 0.229 0.94 0.00 0.94 1.00 5.44 0.5 -0.002 -1.912 -0.013 0.229 1.927 0.231 0.94 0.00 0.93 0.99 8.34 Table 4.5: Bias, asymptotic RMSE, 95% CP, and relative RMSE of slope (βˆ1 ) for ρ = 0.2 over sample sizes 50, 100, and 500 and right censoring proportions 0.1, 0.3, and 0.5 n 50 100 500 p Bias RMSE CP RRMSE CLS Naive PM CLS Naive PM CLS Naive PM CLS/PM Naive/PM 0.1 0.000 -0.021 -0.000 0.072 0.077 0.073 0.94 0.93 0.93 0.99 1.05 0.3 0.000 -0.031 -0.003 0.072 0.093 0.074 0.94 0.91 0.94 0.97 1.26 0.5 0.000 -0.039 -0.006 0.072 0.108 0.076 0.94 0.92 0.95 0.95 1.43 0.1 -0.001 -0.018 
-0.001 0.050 0.056 0.050 0.94 0.93 0.93 1.00 1.12 0.3 -0.001 -0.029 -0.003 0.050 0.067 0.051 0.94 0.91 0.93 0.98 1.30 0.5 -0.001 -0.035 -0.006 0.050 0.076 0.052 0.94 0.91 0.95 0.96 1.46 0.1 0.000 -0.017 -0.001 0.022 0.028 0.022 0.96 0.90 0.95 1.00 1.25 0.3 0.000 -0.032 -0.003 0.022 0.041 0.023 0.96 0.78 0.96 1.00 1.84 0.5 0.000 -0.041 -0.005 0.022 0.051 0.023 0.96 0.72 0.95 0.97 2.22 49 4.2.3 Results in the Presence of Left Censored Observations Figure 4.5 illustrate the boxplots of parameter estimates for SLR models with n = 100, ρ = 0.2 using the CLS models, the naive models, and the proposed models in the presence of left censored observations in data. The proposed models show consistent estimates for intercept with less variability around the actual value of βˆ0 = −1.3 (indicated by the red dashed line). In contrast, naive models show a significant deviation from the actual value, especially with higher proportions of left censored data, indicating high bias and variability. The slope (βˆ1 ) estimates obtained from the naive models lie closer to the actual value compared to the intercept estimate, but gradually deviate from the actual value with increased left censoring proportion. In addition, the variability of the slope estimates obtained from the naive models increases significantly with extreme outliers for heavy censoring. The boxplots displaying parameter estimates for the BVN distribution for this combination are included in Appendix C. Figure 4.5: Boxplots for estimates of intercept (βˆ0 ) and slope (βˆ1 ), n = 100 and ρ = 0.2 over different left censoring proportions of 0.1, 0.3, and 0.5 The asymptotic variances of the intercept (βˆ0 ) and slope (βˆ1 ) estimates for the CLS, Naive, and proposed methods are depicted in Figure 4.6, in the presence of left censored data. These variances were evaluated across varying proportions of left 50 censored data and the three sample sizes. 
The findings demonstrate that our proposed models yield lower and more stable asymptotic variances for the slope and intercept than the naive models, especially as the proportion of left censored data increases. While the naive estimates exhibit variances close to those of the other models when the sample size is large (500), they show higher variance under small sample sizes (50, 100) and heavy censoring (0.3, 0.5).

Figure 4.6: Line charts for asymptotic variance of intercept ($\hat{\beta}_0$) and slope ($\hat{\beta}_1$) when ρ = 0.2 over different left censoring proportions of 0.1, 0.3 and 0.5 and sample sizes 50, 100, and 500

Tables 4.6 and 4.7 provide computations of bias, RMSE, 95% CP, and relative RMSE for $\hat{\beta}_0$ and $\hat{\beta}_1$ across different sample sizes and left censoring proportions. The intercept estimates exhibit significantly higher bias and RMSE in the naive models than in the proposed models, irrespective of sample size and proportion of left censoring. Moreover, the CI coverage rate of the naive estimates drops sharply as the left censoring proportion increases, whereas that of the proposed estimates does not. The proposed models demonstrate a higher CP, ranging from 93% to 97% for the intercept and slope. The relative RMSE of the CLS over the proposed estimates is close to 1, indicating that the proposed estimates perform comparably to the benchmark, and the relative RMSE of the naive over the proposed estimates is greater than 1. This indicates that the proposed estimates outperform the naive estimates.
Table 4.6: Bias, asymptotic RMSE, 95% CP, and relative RMSE of intercept ($\hat{\beta}_0$) for ρ = 0.2 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

          |  Bias                    |  RMSE                 |  CP                   |  RRMSE
  n    p  |     CLS   Naive      PM  |    CLS  Naive     PM  |    CLS  Naive     PM  |  CLS/PM  Naive/PM
 50  0.1  |   0.002   0.836  -0.008  |  0.738  1.106  0.743  |   0.97   0.79   0.97  |    0.99      1.49
     0.3  |   0.002   1.480   0.012  |  0.738  1.717  0.749  |   0.97   0.60   0.96  |    0.99      2.29
     0.5  |   0.002   2.209   0.030  |  0.738  2.448  0.754  |   0.97   0.43   0.96  |    0.98      3.25
100  0.1  |   0.014   0.575   0.018  |  0.517  0.777  0.517  |   0.97   0.80   0.97  |    1.00      1.50
     0.3  |   0.014   1.409   0.043  |  0.517  1.530  0.522  |   0.97   0.33   0.96  |    0.99      2.93
     0.5  |   0.014   2.155   0.065  |  0.517  2.270  0.528  |   0.97   0.14   0.95  |    0.98      4.30
500  0.1  |  -0.002   0.560  -0.000  |  0.229  0.606  0.229  |   0.94   0.32   0.95  |    1.00      2.65
     0.3  |  -0.002   1.411   0.017  |  0.229  1.436  0.232  |   0.94   0.00   0.94  |    0.99      6.20
     0.5  |  -0.002   2.167   0.047  |  0.229  2.189  0.236  |   0.94   0.00   0.93  |    0.97      9.27

Table 4.7: Bias, asymptotic RMSE, 95% CP, and relative RMSE of slope ($\hat{\beta}_1$) for ρ = 0.2 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

          |  Bias                    |  RMSE                 |  CP                   |  RRMSE
  n    p  |     CLS   Naive      PM  |    CLS  Naive     PM  |    CLS  Naive     PM  |  CLS/PM  Naive/PM
 50  0.1  |   0.000  -0.024  -0.002  | 0.0721  0.077  0.072  |   0.94   0.92   0.93  |    1.00      1.07
     0.3  |   0.000  -0.032  -0.004  |  0.072  0.092  0.074  |   0.94   0.91   0.94  |    0.98      1.25
     0.5  |   0.000  -0.039  -0.007  |  0.072  0.107  0.075  |   0.94   0.91   0.95  |    0.96      1.42
100  0.1  |  -0.001  -0.019  -0.002  |  0.050  0.056  0.050  |   0.94   0.92   0.94  |    1.00      1.12
     0.3  |  -0.001  -0.031  -0.005  |  0.050  0.067  0.051  |   0.94   0.91   0.93  |    0.98      1.31
     0.5  |  -0.001  -0.039  -0.008  |  0.050  0.077  0.053  |   0.94   0.90   0.93  |    0.95      1.46
500  0.1  |   0.000  -0.017  -0.000  |  0.022  0.028  0.022  |   0.96   0.89   0.96  |    1.00      1.24
     0.3  |   0.000  -0.032  -0.003  |  0.022  0.042  0.023  |   0.96   0.75   0.95  |    1.00      1.85
     0.5  |   0.000  -0.042  -0.006  |  0.022  0.050  0.023  |   0.96   0.69   0.95  |    0.97      2.18

4.2.4 Results in the Presence of Interval Censored Observations

Figure 4.7 illustrates the boxplots of parameter estimates for SLR models with n = 100 and ρ = 0.2 for the CLS models, the naive models, and the proposed models in the presence of interval censored observations. For the naive and proposed models, the average of the estimates is centered around the true value $\beta_0 = -1.3$ (indicated by the red dashed line). However, the naive estimates exhibit high variability around the true value, especially with a higher proportion of interval censoring. In contrast, the proposed estimates show consistently lower variability across all scenarios. The slope ($\hat{\beta}_1$) estimates obtained from the naive models are closer to the true value but show a slight deviation when the censoring proportion is 0.5. Additionally, the variability of the slope estimates from the naive models increases significantly with heavy censoring compared to the proposed models. The boxplots displaying parameter estimates for the BVN distribution for this combination are included in Appendix D.

Figure 4.7: Boxplots for estimates of intercept ($\hat{\beta}_0$) and slope ($\hat{\beta}_1$), when n = 100 and ρ = 0.2 over different interval censoring proportions of 0.1, 0.3, and 0.5

The asymptotic variances of the intercept ($\hat{\beta}_0$) and slope ($\hat{\beta}_1$) estimates for the CLS, naive, and proposed models in the presence of interval censored data are depicted in Figure 4.8. These variances were evaluated across varying proportions of interval censored data and for the three sample sizes. The findings demonstrate that our proposed models yield lower and more stable asymptotic variances for the parameters than the naive models, especially as the proportion of interval censored data increases. While the naive estimates exhibit variances close to those of the other models when censoring is light (0.1 and 0.3), they show higher variance under small sample sizes (50, 100) and heavy censoring (0.5).
Figure 4.8: Line charts for asymptotic variance of intercept ($\hat{\beta}_0$) and slope ($\hat{\beta}_1$) when ρ = 0.2 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

Tables 4.8 and 4.9 provide computations of bias, RMSE, 95% CP, and relative RMSE for $\hat{\beta}_0$ and $\hat{\beta}_1$ across different sample sizes and interval censoring proportions. The estimates generally exhibit higher bias, and always a higher RMSE, under the naive method than under the proposed method, irrespective of sample size and proportion of interval censoring. Moreover, the CI coverage rate of the naive method decreases as the censoring proportion increases, whereas that of the proposed method does not. The proposed method demonstrates a higher CP, ranging from 93% to 97% for the intercept and slope. The relative RMSE of CLS over PM indicates similar performance between the proposed method and the CLS method, and the relative RMSE of Naive over PM is greater than 1 in almost all settings, which indicates that the proposed method outperforms the naive method.

Table 4.8: Bias, asymptotic RMSE, 95% CP, and relative RMSE of intercept ($\hat{\beta}_0$) for ρ = 0.2 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5

          |  Bias                    |  RMSE                 |  CP                   |  RRMSE
  n    p  |     CLS   Naive      PM  |    CLS  Naive     PM  |    CLS  Naive     PM  |  CLS/PM  Naive/PM
 50  0.1  |   0.002  -0.019   0.002  |  0.738  0.781  0.738  |   0.97   0.95   0.97  |    1.00      1.06
     0.3  |   0.002  -0.066  -0.002  |  0.738  0.900  0.737  |   0.97   0.90   0.96  |    1.00      1.22
     0.5  |   0.002  -0.130  -0.004  |  0.738  1.065  0.737  |   0.97   0.85   0.96  |    1.00      1.44
100  0.1  |   0.014   0.020   0.014  |  0.517  0.546  0.517  |   0.97   0.95   0.97  |    1.00      1.06
     0.3  |   0.014  -0.000   0.017  |  0.517  0.626  0.517  |   0.97   0.82   0.97  |    1.00      1.21
     0.5  |   0.014   0.027   0.013  |  0.517  0.743  0.517  |   0.97   0.70   0.97  |    1.00      1.44
500  0.1  |  -0.002  -0.009  -0.002  |  0.229  0.242  0.229  |   0.94   0.86   0.94  |    1.00      1.06
     0.3  |  -0.002  -0.030   0.001  |  0.229  0.280  0.229  |   0.94   0.58   0.94  |    1.00      1.22
     0.5  |  -0.002  -0.060  -0.001  |  0.229  0.334  0.229  |   0.94   0.40   0.94  |    1.00      1.46

Table 4.9: Bias, asymptotic RMSE, 95% CP, and relative RMSE of slope ($\hat{\beta}_1$) for ρ = 0.2 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5

          |  Bias                    |  RMSE                 |  CP                   |  RRMSE
  n    p  |     CLS   Naive      PM  |    CLS  Naive     PM  |    CLS  Naive     PM  |  CLS/PM  Naive/PM
 50  0.1  |   0.000   0.003   0.000  |  0.072  0.076  0.072  |   0.94   0.93   0.93  |    1.00      1.05
     0.3  |   0.000   0.007   0.000  |  0.072  0.086  0.072  |   0.94   0.93   0.93  |    1.00      1.19
     0.5  |   0.000   0.016   0.001  |  0.072  0.099  0.073  |   0.94   0.93   0.93  |    0.99      1.36
100  0.1  |  -0.001  -0.001  -0.001  |  0.050  0.053  0.050  |   0.94   0.94   0.94  |    1.00      1.06
     0.3  |  -0.001   0.002  -0.001  |  0.050  0.059  0.050  |   0.94   0.94   0.93  |    1.00      1.18
     0.5  |  -0.001   0.010  -0.001  |  0.050  0.068  0.050  |   0.94   0.93   0.94  |    1.00      1.36
500  0.1  |   0.000   0.001   0.000  |  0.022  0.022  0.022  |   0.96   0.95   0.96  |    1.00      1.00
     0.3  |   0.000   0.006   0.000  |  0.022  0.027  0.022  |   0.96   0.92   0.96  |    1.00      1.21
     0.5  |   0.000   0.013   0.000  |  0.022  0.033  0.022  |   0.96   0.87   0.95  |    1.00      1.46

4.2.5 Summary of the Simulation Results

Extensive simulation studies reveal that the proposed approach outperforms the naive approach by providing less biased estimates, reduced standard errors, higher coverage probability, and RRMSE values greater than 1, regardless of correlation, sample size, or the proportion of incomplete data.

Chapter 5

Data Applications

This chapter presents the results obtained from applying the proposed approach to two real data applications: the first for missing observations and the second for left censored observations.

5.1 Determining the Effect of Solar Radiation on Ozone Concentration–An Application to the Presence of Missing Observations

Ozone is a significant atmospheric air pollutant with dual effects on environmental health, air quality, and climate dynamics. Ozone acts as a protective layer in the stratosphere by absorbing the most harmful ultraviolet (UV) radiation released by the sun, specifically UV–B and UV–C rays. This shielding effect is significant in safeguarding living organisms from various serious health issues, including skin cancers, DNA damage, cataracts, and other related health problems.
However, Ozone acts as a harmful air pollutant in the troposphere, the lowest layer of the atmosphere. Ground–level Ozone, primarily formed through solar radiation, presents significant health hazards. It impacts respiratory health, leading to asthma, diminished lung capacity, and increased respiratory infections.

Numerous researchers have investigated the relationship between Ozone concentration and solar radiation, including Jasaitis et al. [2016], Belan and Sklyadneva [1999], Ghalib et al. [2020], and Rivas and Rojas [2018]. Jasaitis et al. [2016] observed a positive linear correlation between solar radiation and Ozone concentration in the atmospheric ground layer. Although they noted that the relationship was not exceptionally strong, they emphasized its statistical significance, which had a significance level of less than 0.5.

The air quality dataset in the MICE package for R is utilized to assess the regression models fitted via the proposed method and a naive approach. This dataset comprises 154 daily air quality measurements collected over 154 consecutive days from May to September 1973 in New York City. The dataset integrates Ozone data from the New York State Department of Conservation with meteorological data from the National Weather Service. The dataset contains 42 incomplete observations, with 37 observations missing in Ozone concentration and 7 observations missing in solar radiation, which constitutes a 26.5% missing percentage. Little's MCAR test was performed using the naniar package in R to assess whether the missing data follow the MCAR assumption. The test yielded a p–value of 0.0014, which is less than the significance level of 0.05, suggesting that the missing data are not MCAR. We assumed the data are MAR, meaning that the missingness is related to the observed data.
The scatter plot in Figure 5.1 illustrates the relationship between Ozone concentration (measured in parts per billion) and solar radiation (measured in Langleys), providing a visual representation of how these two variables interact.

Figure 5.1: Scatter plot of Ozone concentration (ppb) over Solar radiation (Lang)

Normal Q–Q plots and histograms are presented in Figures 5.2 and 5.3 to assess whether Ozone concentration and solar radiation follow a normal distribution, as this is one of our primary assumptions. It is evident that the Ozone concentrations follow a non–normal distribution, whereas solar radiation approximately follows a normal distribution. Due to the non–normality in Ozone concentrations, a log transformation is applied to better approximate normality. The histogram and normal Q–Q plot for log–transformed Ozone concentration are illustrated in Figure 5.4.

SLR models are fitted using two distinct approaches: the naive approach, which excludes observations with missing values, and the EM–based proposed approach, which imputes missing values to obtain parameter estimates. The effectiveness of the proposed estimates is assessed by comparing them to the naive estimates.

Figure 5.2: Histogram and normal Q–Q plot for solar radiation

Figure 5.3: Histogram and normal Q–Q plot for Ozone concentration

Figure 5.4: Histogram and normal Q–Q plot for log–transformed Ozone concentration

Table 5.1 displays parameter estimates for an SLR model obtained through both the naive (naive model) and proposed (proposed model) approaches. Although the intercept and slope estimates are approximately equal between the two models, notable differences emerge in the standard errors of these estimates. The standard errors of both the slope and the intercept of the proposed model are lower than those of the naive model.
Consequently, the CIs obtained from the proposed model are narrower, indicating greater certainty in the estimated parameters. For instance, the ratio $0.6608/0.5692 = 1.16$ indicates that the intercept interval of the naive model is 1.16 times wider than that of the proposed model, representing a 16% increase. Similarly, the ratio $0.0032/0.0028 = 1.14$ indicates that the slope interval of the naive model is 1.14 times wider than that of the proposed model, representing a 14% increase. The slightly lower AIC of the proposed model (352.4097) compared to the naive model (352.4452) indicates a marginally better overall fit to the data. This suggests that the proposed model offers more precise estimates, enhancing confidence in the model's predictions and parameter interpretations.

Table 5.1: Comparison of model coefficients, standard errors, Lower Confidence Limit (LCL) and Upper Confidence Limit (UCL), and interval width of naive model and proposed model

                                                     95% CI
Model            Parameter   Estimate   SE       LCL      UCL      Interval width
Naive model      β0          2.6152     0.1667   2.2849   2.9456   0.6608
                 β1          0.0043     0.0008   0.0027   0.0059   0.0032
Proposed model   β0          2.6173     0.1452   2.3327   2.9019   0.5692
                 β1          0.0043     0.0007   0.0030   0.0057   0.0028

5.2 Exploring the Relationship Between Blood and Feather Lead Concentrations in Black–Crowned Night Herons (Nycticorax nycticorax)–An Application to the Presence of Left Censored Observations

Recent research shows growing interest in applying noninvasive techniques to monitor environmental contaminant exposure and its subsequent effects on wildlife populations. This focus is particularly relevant for avian species, where heavy metals such as Lead (Pb), Mercury (Hg), Selenium (Se) and Cadmium (Cd) accumulate in feathers and tissues, causing significant health problems [Golden et al., 2003]. Feathers act as airborne particle collectors, providing a valuable method for measuring a bird's exposure to heavy metals.
By analyzing the concentrations of metals in feathers, researchers can gain important insights into the level of heavy metal pollution in the birds' surrounding environment [Ellenberg et al., 1985]. Juvenile birds typically stay in the area where they were born until they are mature enough to migrate. The metal concentrations in their feathers can therefore provide a better picture of the environmental contamination at a specific location. Feathers are advantageous as a monitoring tool because they are easy to collect and store, can be sampled from the same bird over multiple years, and cause minimal harm to the bird. This makes feathers an excellent choice for noninvasive monitoring of heavy metal exposure in avian species.

Blood is a frequently used tissue for monitoring lead exposure in birds. When a bird consumes lead–contaminated food or water, the lead swiftly enters its bloodstream. Lead levels in the blood therefore clearly indicate recent exposure to lead sources. Feathers grow gradually over time and, as they develop, accumulate the lead circulating in the bloodstream during their formation. Thus, feather lead concentration can indicate both recent and historical exposure to lead. Understanding the relationship between lead concentrations in these two tissues can improve the efficacy of monitoring programs and offer valuable insights into the dynamics of lead exposure in herons.

Numerous studies have investigated lead contamination in birds, mainly focusing on the relationship between lead concentrations in blood, feathers and other organs such as liver, kidney, and bone. These studies have generally found a positive correlation, suggesting that feathers can serve as reliable indicators of lead exposure in blood. Studies conducted by Golden et al. [2003], Dauwe et al. [2002], and Pain et al.
[2019] have investigated the association between lead concentrations in blood and feathers of different bird species. Golden et al. [2003] observed a positive correlation between the lead concentrations of feathers and the blood of herons (Nycticorax nycticorax).

The association between lead concentrations in feathers and blood is investigated using the publicly available PbHerons dataset from the NADA2 package in R. This dataset originates from a study conducted by Golden et al. [2003], which aimed to determine the relationship between lead concentrations in feathers and various other organs, including the liver, kidney, bone, brain, and blood. This analysis focuses on the lead concentrations in feathers and blood to explore the practical application of the proposed methodology. Herons, such as the Black-crowned Night Heron (Nycticorax nycticorax), are top predators in aquatic ecosystems and are particularly susceptible to the bioaccumulation of lead from contaminated water bodies and prey.

The dataset comprises 27 observations. Among these, 15 observations have left-censored lead concentration values in blood at 0.06 and 0.07 µg/g, accounting for 55.6% of the dataset. Additionally, there are 2 left-censored values for lead concentration in feathers at 0.02 µg/g, representing 7.4% of the total observations. It is important to note that the two left-censored feather lead concentration observations are also left-censored for blood lead concentrations. The scatter plot showing the relationship between blood and feather lead concentrations is presented in Figure 5.5.

Figure 5.5: Scatter plot of lead concentration in blood and feather of Herons

Table 5.2 presents the parameter estimates for three different models: the naive model, which ignored censored observations; the alternate model, which considered complete observations by taking the threshold values in x and y as observed values; and the EM–based proposed model.
A comparison of the intercept and slope coefficients among the three models reveals notable differences. In particular, the standard errors for the slope and intercept are highest in the naive model, followed by the alternate model, and lowest in the proposed model. For the naive model, zero lies within the CIs of both the intercept and the slope. Therefore, we fail to reject the null hypothesis: there is not enough evidence to conclude that $\beta_0$ and $\beta_1$ are significantly different from zero at the 95% confidence level. However, zero does not lie within the respective CIs for the alternate and proposed models, which implies rejection of the null hypothesis. Therefore, there is sufficient evidence to conclude that the intercept and slope of the alternate and proposed models are significantly different from zero at the 95% confidence level. In addition, the proposed model has narrower CIs for both $\beta_0$ and $\beta_1$ than the alternate model, which suggests that the proposed model estimates these parameters more precisely.

The AIC is lowest in the proposed model (91.5285), followed by the alternate model (91.9588), and highest in the naive model (92.0023). The lower AIC of the proposed model indicates a better balance between model complexity and goodness of fit. These findings collectively indicate that the proposed model outperforms both the naive and alternate models across multiple evaluation criteria. It demonstrates superior explanatory power, better model fit, and higher predictive accuracy, making it a more robust choice for analyzing and predicting lead concentration in feathers from lead concentration in blood than the other models evaluated.
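The interval-based reasoning above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the confidence limits and AIC values reported in Table 5.2; the dictionary layout and the helper names `excludes_zero` and `width` are illustrative choices and are not part of the thesis code.

```python
# Confidence limits (LCL, UCL) and AIC values as reported in Table 5.2.
# The data structures and helper names are hypothetical, for illustration only.
intervals = {
    ("naive", "beta0"): (-1.7739, 0.6042),
    ("naive", "beta1"): (-3.7968, 17.4071),
    ("alternate", "beta0"): (-2.0760, -0.8150),
    ("alternate", "beta1"): (2.9473, 19.4866),
    ("proposed", "beta0"): (-1.4419, -0.5199),
    ("proposed", "beta1"): (3.6794, 13.9824),
}
aic = {"naive": 92.0023, "alternate": 91.9588, "proposed": 91.5285}


def excludes_zero(lcl, ucl):
    """True when zero lies outside [lcl, ucl], i.e. the null value is rejected."""
    return not (lcl <= 0.0 <= ucl)


def width(lcl, ucl):
    """Width of a confidence interval."""
    return ucl - lcl


# Which parameters differ significantly from zero at the 95% level?
significant = {key: excludes_zero(*limits) for key, limits in intervals.items()}

# Interval widths and the model with the smallest AIC.
widths = {key: round(width(*limits), 4) for key, limits in intervals.items()}
best_by_aic = min(aic, key=aic.get)

for key in intervals:
    print(key, "significant:", significant[key], "width:", widths[key])
print("lowest AIC:", best_by_aic)
```

Because the AIC values differ only in the first decimal place, the ranking is best read together with the interval widths rather than on its own.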
Table 5.2: Comparison of model coefficients, standard errors, Lower Confidence Limit (LCL) and Upper Confidence Limit (UCL), and interval width of naive model, alternate model, and proposed model

                                                       95% CI
Model             Parameter   Estimate   SE       LCL       UCL       Interval width
Naive model       β0          -0.5848    0.5337   -1.7739    0.6042    2.3781
                  β1           6.8052    4.7582   -3.7968   17.4071   21.2039
Alternate model   β0          -1.4455    0.3061   -2.0760   -0.8150    1.2610
                  β1          -0.1558    4.0153    2.9473   19.4866   16.5393
Proposed model    β0          -0.9809    0.2352   -1.4419   -0.5199    0.9220
                  β1           8.8309    2.6283    3.6794   13.9824   10.3031

The fitted SLR model, $y = -0.9809 + 8.8309x$, where y represents the lead concentration in feathers and x denotes the lead concentration in blood, offers valuable insights into the relationship between these two variables. This equation suggests that for every unit increase in the lead concentration in blood (x), there is an estimated increase of 8.8309 µg/g in the lead concentration in feathers (y). The intercept of -0.9809 indicates the estimated lead concentration in feathers when the lead concentration in blood is zero; however, a blood lead concentration of exactly zero is not realistic because of environmental exposure. To ensure the model is correctly specified, a further investigation into data accuracy and the identification of outliers is necessary.

5.3 Summary of the Data Applications

Based on the two real data applications, the proposed models provided a better model fit than the naive models in terms of standard errors and widths of the CIs. This indicates the efficiency and superior consistency of the proposed approach relative to the existing naive approach.

Chapter 6

Discussion

In this thesis, we studied the estimation of regression parameters in the presence of left censored, right censored, interval censored, and missing observations based on the EM algorithm.
We reviewed previous studies on handling incomplete data across different distributions, identified existing research gaps, and justified the importance of our study. We presented the theoretical assumptions and properties and explained how the EM algorithm is adapted in regression modeling to address various types of censored data and missing values. We conducted simulation studies to evaluate the performance of the proposed EM–based methodology under various scenarios of censoring and missing data. We presented the design of the simulations, the parameters used, and the simulation results. Our analysis compared the performance of a naive approach with the proposed methodology, evaluating key metrics such as asymptotic variance, asymptotic RMSE, CP, and RRMSE. We also applied the proposed methodology to real–world datasets.

The findings revealed that the proposed method consistently provided more accurate and less biased estimates of the regression parameters across all types of censored and missing data scenarios. This method demonstrated superior consistency and efficiency, which is crucial for reliable inference in practical applications. The asymptotic variance of the estimates was significantly lower with the proposed method, indicating greater precision in the parameter estimates. This reduction in variance was observed consistently across interval censored, left censored, right censored, and missing observations, highlighting the robustness of the proposed approach. Furthermore, the proposed approach yielded lower asymptotic RMSE values than the naive approach, suggesting improved predictive performance and reliability. The reduction in RMSE was particularly notable in scenarios with high levels of censoring or missing data, where the naive approach often struggled to provide accurate estimates.
Regarding CP, the proposed method achieved consistently higher values, reflecting its effectiveness in providing accurate interval estimates and capturing the true uncertainty in the parameter estimates. Additionally, the RRMSE of the naive method over the proposed method was greater than 1 across all types of censored and missing data scenarios, indicating the superior reliability and robustness of the proposed method in dealing with complex data structures and in providing more accurate and stable estimates.

These findings have important implications for statistical modeling in the presence of censored and missing data. The proposed method's ability to outperform the naive method in terms of parameter accuracy, precision, and reliability makes it a valuable tool for researchers and practitioners in various fields, including medical research, environmental studies, and social sciences.

6.1 Limitations of the Study

Despite the promising simulation and real–data results of this study, it is crucial to identify its potential limitations. The high computing power and time that the algorithm requires can make it impractical or difficult to use in real-world situations where such extensive resources are not readily available. Moreover, the performance of the algorithm might vary significantly depending on the initial parameters. Inappropriate initial parameters can cause the algorithm to converge to a local maximum rather than the global maximum. Therefore, selecting the initial parameters might require running the algorithm multiple times with different initial points, which demands additional computational time and effort. Another limitation is the unexplored performance of the EM algorithm in high–dimensional settings, which could restrict its applicability in cases involving large numbers of predictors.

6.2 Future Work

Future research should focus on several key areas to address these limitations.
Comparing the proposed method with other imputation techniques, such as mean imputation and regression imputation, would allow the performance of the proposed approach to be benchmarked against existing techniques. Additionally, applying multiple imputation to the proposed approach could be valuable. This would involve generating multiple models from the same dataset by using different initial parameters during the initialization step of the EM algorithm. Furthermore, developing robust versions of the EM algorithm that are less sensitive to violations of model assumptions would increase its applicability. Extending the proposed methodology to handle high–dimensional data would also represent a significant advancement. Moreover, the placement of censoring thresholds and the width of intervals for interval censoring should be investigated. Understanding how close these intervals are to the mean can provide insights into the accuracy and reliability of the estimates. Applying the proposed methodology to a broader range of real–world applications, including diverse types of data and various fields of study, would actively engage the community in validating and refining the approach. Finally, creating user-friendly software packages that implement the proposed methodology for regression modeling would facilitate adoption by practitioners and researchers, making the techniques developed in this study more accessible and widely used.

Bibliography

Allison, P. D. (2012). Handling missing data by maximum likelihood.
Amini, M. and Ahmadi, J. (2013). Asymptotic efficiencies of the MLE based on bivariate record values from bivariate normal distribution. JIRSS, 12(2):235–252.
Annas, S., Kartikasari, P., and Arisandi, R. (2021).
Handling incomplete data with regression imputation. Journal of Physics, 1752(1).
Atkinson, S. E. (1992). The performance of standard and hybrid em algorithms for ml estimates of the normal mixture model with censoring. Journal of Statistical Computation and Simulation, 44(1-2):105–115.
Austin, P. C. and Hoch, J. S. (2004). Estimating linear regression models in the presence of a censored independent variable. Statistics in Medicine, 23(3):411–429.
Austin, P. C., White, I. R., Lee, D. S., and Van Buuren, S. (2021). Missing data in clinical research: A tutorial on multiple imputation. Canadian Journal of Cardiology, 37(9):1322–1331.
Balakrishnan, N. and Cohen, A. (1991). Order Statistics and Inference: Estimation Methods. Academic Press.
Belan, B. and Sklyadneva, T. (1999). Influence of solar radiation on the variation of ozone concentration in the ground atmospheric layer. Proceedings of SPIE - The International Society for Optical Engineering.
Buckley, J. and James, I. (1979). Linear regression with censored data. Biometrika, 66(3):429–436.
Burkardt, J. (2023). The truncated normal distribution.
Canavire-Bacarreza, G., Rios-Avila, F., Sacco-Capurro, F., and Bank, T. W. (2023). Recovering income distribution in the presence of interval-censored data. IZA Discussion Paper Series IZA DP No. 15921, IZA Institute of Labor Economics.
Chacko, M. and Mathew, S. (2020). Inference on p(x < y) for bivariate normal distribution based on censored data. Journal of the Indian Society for Probability and Statistics, 21(2):487–509.
Cohen, J., Cohen, P., West, S., and Aiken, L. (2003). Applied multiple regression/correlation analysis for the behavioural sciences. Lawrence Erlbaum Associates, 3rd edition.
Dauwe, T., Bervoets, L., Blust, R., and Eens, M. (2002). Tissue levels of lead in experimentally exposed zebra finches (taeniopygia guttata) with particular attention on the use of feathers as biomonitors. Archives of environmental contamination and toxicology, 42:88–92.
De Lima Taga, M. F. and Singer, J. M. (2016). Simple linear regression with interval censored dependent and independent variables. Statistical Methods in Medical Research, 27(1):198–207.
Dempster, A. P., Laird, N. M., and Rubin, D. B. (1977). Maximum likelihood from incomplete data via the em algorithm. J R Stat Soc Series.
Dube, S., Pradhan, B., and Kundu, D. (2011). Parameter estimation of the hybrid censored log-normal distribution. Journal of Statistical Computation and Simulation, 81(3):275–287.
Ellenberg, H., Dietrich, J., Stoeppler, M., and Nürnberg, H. (1985). Environmental monitoring of heavy metals with birds as pollution integrating biomonitors: introduction, definitions and practical examples for goshawk (accipiter gentilis). In Lekkas, T., editor, Heavy metals in the environment, pages 724–726. CEP Consultants Ltd., Athens.
Ferreira, L. A. and Silva, J. L. (2017). Parameter estimation for weibull distribution with right censored data using em algorithm. Eksploatacja I Niezawodnosc - Maintenance and Reliability, 19(2):310–315.
Ghalib, W., Al-Taai, O., and Abbood, Z. (2020). The influence of solar radiation on ozone column weight over baghdad city. volume 928.
Golden, N., Rattner, B., Cohen, J., and Ottinger, M. (2003). Lead accumulation in feathers of nestling black-crowned night herons (nycticorax nycticorax). Environmental Toxicology and Chemistry, 22:1517–1524.
Gustavsson, S. (2012). Linear maximum likelihood regression analysis for untransformed log-normally distributed data. Open Journal of Statistics.
Guzmán, D. C. F., Ferreira, C. S., and Zeller, C. B. (2020). Linear censored regression models with skew scale mixtures of normal distributions. Journal of Applied Statistics, 48(16):3060–3085.
Haitovsky, Y. (1968). Missing data in regression analysis. Journal of the Royal Statistical Society Series B-methodological, 30(1):67–82.
Hardt, J., Herke, M., and Leonhart, R. (2012).
Auxiliary variables in multiple imputation in regression with missing x: a warning against including too many in small sample research. BMC Medical Research Methodology, 12(1).
Hussain, E. A., Al-Shallawi, A. N. S., and Saied, H. A. (2022). Using maximum likelihood method to estimate parameters of the linear regression t truncated model. NTU Journal of Pure Sciences, pages 2789–1097.
Jadhav, A., Pramod, D., and Ramanathan, K. (2019). Comparison of performance of data imputation methods for numeric dataset. Applied Artificial Intelligence, 33(10):913–933.
Jasaitis, D., Vasiliauskienė, V., Chadyšienė, R., and Pečiulienė, M. (2016). Surface ozone concentration and its relationship with uv radiation, meteorological parameters and radon on the eastern coast of the baltic sea. Atmosphere, 7(2):27.
Josse, J., Jiang, W., Sportisse, A., and Robin, G. (2018). Handling missing values.
Kang, H. (2013). The prevention and handling of the missing data. Korean Journal of Anesthesiology, 64(5):402–406.
Kayid, M. and Al-Maflehi, N. (2022). Em algorithm for estimating the parameters of quasi-lindley model with application. Journal of Mathematics, pages 1–9.
Kim, H.-Y. (2019). Statistical notes for clinical researchers: simple linear regression 3 – residual analysis. Restorative Dentistry Endodontics, 44:116–118.
Koul, H., Susarla, V., and Van Ryzin, J. (1981). Regression analysis with randomly right-censored data. The Annals of Statistics, 9(6):1276–1288.
Lee, G. and Scott, C. (2012). Em algorithms for multivariate gaussian mixture models with truncated and censored data. Computational Statistics & Data Analysis, 56(9):2816–2829.
Little, R. J. A. and Rubin, D. B. (2002). Statistical Analysis with Missing Data. Wiley series in probability and statistics.
Mamat, A. R., Mohamed, M. A., Nasruddin, M. H., Awang, M. K., and Mohamed, F. S. (2019). Least square method technique for predicting the acquisition of raw materials and sales of crisp for small and medium enterprises.
International Journal of Recent Technology and Engineering (IJRTE), 7(5S4):612–613.
Maulud, D. H. and Abdulazeez, A. M. (2020). A review on linear regression comprehensive in machine learning. Journal of Applied Science and Technology Trends, pages 140–147.
McLachlan, G. J. and Jones, P. N. (1988). Fitting mixture models to grouped and truncated data via the EM algorithm. Biometrics, 44(2):571–578.
McLachlan, G. J., Krishnan, T., and Ng, S. K. (2004). The em algorithm. Humboldt-Universität zu Berlin, Center for Applied Statistics and Economics (CASE).
Miller, R. G. (1976). Least squares regression with censored data. Biometrika, 63(3):449–464.
Mouret, F., Hippert-Ferrer, A., Pascal, F., and Tourneret, J. (2023). A robust and flexible em algorithm for mixtures of elliptical distributions with missing data. IEEE Transactions on Signal Processing, 71:1669–1682.
Ng, H. K. T., Chan, P. S., and Balakrishnan, N. (2002). Estimation of parameters from progressively censored data using em algorithm. Computational Statistics & Data Analysis, 39(4):371–386.
Ng, S. K., Krishnan, T., and McLachlan, G. J. (2011). The EM algorithm, pages 139–172.
Orchard, T. and Woodbury, M. A. (1972). A missing information principle: Theory and applications. In Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Theory of Statistics, volume 6, pages 697–716. University of California Press.
Pain, D., Mateo, R., and Green, R. (2019). Effects of lead from ammunition on birds and other wildlife: A review and update. Ambio, 48(9):935–953. PMID: 30879267; PMCID: PMC6675766.
Park, C. and Lee, S. B. (2012). Parameter estimation from censored samples using the expectation-maximization algorithm.
R Core Team (2022). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
Reynolds, W. M. and Sundberg, N. D. (1976). Recent research trends in testing. Journal of Personality Assessment, 40(3):228–233.
Ritov, Y. (1990). Estimation in a linear regression model with censored data. The Annals of Statistics, 18(1):303–328.
Rivas, M. and Rojas, E. (2018). Effects of ozone layer variation in ultraviolet solar radiation level received at ground in arica north of chile. Journal of Physics: Conference Series, 1043:012066.
Roussas, G. (2003). Introduction to Probability and Statistical Inference. Elsevier Science.
Rubin, D. B. (1976). Inference and missing data. Biometrika, 63(3):581.
Rueda, J. N. and Garcia, A. C. (2018). Process control with highly left censored data.
Shylaja, B. and Kumar, S. (2018). Traditional versus modern missing data handling techniques: An overview. International Journal of Pure and Applied Mathematics, 118(14):1314–3395.
Soch, J. et al. (2023). Statproofbook 2022 (version 2022).
Steiner, S. H. and Mackay, R. J. (2000). Monitoring processes with highly censored data. Journal of Quality Technology, 32(3):199–208.
Strike, K., El Emam, K., and Madhavji, N. (2001). Software cost estimation with incomplete data. IEEE Transactions on Software Engineering, 27(10):890–908.

Appendices

Appendix A

EM Algorithm for Different Censoring Types

A.1 EM Algorithm for Right Censored Observations

This section explains the computation of the E–step and the M–step when the data contain right censored observations.

A.1.1 E–Step

The E–step of the EM algorithm requires computing the expected complete data log–likelihood conditioned on the observed data and the current parameter estimates. Incorporating the censoring mechanism for right censored data is also crucial in this computation. Let (x, y) be the two BVN random variables, θ be the parameter vector, x be right censored at $c_x^{rc}$, and y be right censored at $c_y^{rc}$. The right censored observations are assumed to be present in the x and y of the dataset according to 4 scenarios, where “o” denotes observed data and “rc” denotes right censored data as shown in Table 3.3.
Scenario   Index            x             y
1          1, ..., m        $x_i^{o}$     $y_i^{o}$
2          m+1, ..., p      $x_i^{o}$     $y_i^{rc}$
3          p+1, ..., q      $x_i^{rc}$    $y_i^{o}$
4          q+1, ..., n      $x_i^{rc}$    $y_i^{rc}$

Table A.1: Scenarios with observed and right-censored data for x and y

Scenario 1: $x_i$ observed, $y_i$ observed

The E–step of the EM algorithm for right censored data, when both $x_i$ and $y_i$ are observed, is the same as scenario 1 of the missing observations. Therefore, it can be computed using equations 3.15–3.19.

Scenario 2: $x_i$ observed, $y_i$ right censored

The E–step involves computing the sufficient statistics expressed in equations 3.2–3.6 when $x_i$ is observed while $y_i$ is right censored in a set of observations. The expected value of $x_i$, when it is observed, is $x_i$ itself. However, the expected values of $y_i^{rc}$ need to be calculated conditioned on the value of $x_i$, the censoring mechanism $y_i^{rc} > c_y^{rc}$, where $c_y^{rc}$ is the threshold, and the current parameter estimates. The computation of the E–step is expressed as

\[ E_{\theta^{(k)}}\left( \sum_{i=m+1}^{p} x_i^{o} \,\middle|\, x_i^{o}, y_i^{rc} \right) = \sum_{i=m+1}^{p} x_i^{o} \]

\[ E_{\theta^{(k)}}\left( \sum_{i=m+1}^{p} y_i^{rc} \,\middle|\, x_i^{o}, y_i^{rc} \right) = \sum_{i=m+1}^{p} E\left( y_i^{rc} \mid x_i^{o} = x,\ y_i^{rc} > c_y^{rc},\ \theta^{(k)} \right) \tag{A.1} \]

\[ E_{\theta^{(k)}}\left( \sum_{i=m+1}^{p} x_i^{2(o)} \,\middle|\, x_i^{o}, y_i^{rc} \right) = \sum_{i=m+1}^{p} x_i^{2(o)} \]

\[ E_{\theta^{(k)}}\left( \sum_{i=m+1}^{p} y_i^{2(rc)} \,\middle|\, x_i^{o}, y_i^{rc} \right) = \sum_{i=m+1}^{p} E\left( y_i^{2(rc)} \mid x_i^{o} = x,\ y_i^{rc} > c_y^{rc},\ \theta^{(k)} \right) \tag{A.2} \]

\[ E_{\theta^{(k)}}\left( \sum_{i=m+1}^{p} x_i^{o} y_i^{rc} \,\middle|\, x_i^{o}, y_i^{rc} \right) = \sum_{i=m+1}^{p} x_i^{o} \cdot E\left( y_i^{rc} \mid x_i^{o} = x,\ y_i^{rc} > c_y^{rc},\ \theta^{(k)} \right). \]

$E(y_i^{rc} \mid x_i^{o} = x, y_i^{rc} > c_y^{rc}, \theta^{(k)})$ and $E(y_i^{2(rc)} \mid x_i^{o} = x, y_i^{rc} > c_y^{rc}, \theta^{(k)})$ in equations A.1 and A.2 can be computed by considering the conditional distribution of the BVN distribution and the right censored normal distribution. The conditional mean and variance when y is conditioned on x can be obtained using

\[ \mu_{y|x}^{(k)} = E(y_i \mid x_i^{o} = x, \theta^{(k)}) = \mu_y^{(k)} + \rho^{(k)} \frac{\sigma_y^{(k)}}{\sigma_x^{(k)}} \left( x_i^{o} - \mu_x^{(k)} \right) \tag{A.3} \]

\[ \sigma_{y|x}^{2(k)} = \mathrm{Var}(y_i \mid x_i^{o} = x, \theta^{(k)}) = \sigma_y^{2(k)} \left( 1 - \rho^{2(k)} \right) \tag{A.4} \]

where $\alpha_y = \dfrac{c_y^{rc} - \mu_{y|x}^{(k)}}{\sigma_{y|x}^{(k)}}$ denotes the standardized censoring threshold, and $\mu_{y|x}^{(k)}$ and $\sigma_{y|x}^{(k)}$ are the mean and standard deviation of $y_i^{rc}$ given $x_i^{o} = x$ at the $k$th iteration, respectively.
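Equations A.3 and A.4, together with the standardized threshold $\alpha_y$, translate directly into code. The following is a minimal sketch, assuming scalar inputs and current-iteration parameter values; the function names `conditional_moments` and `standardized_threshold` and the numeric values are illustrative and do not come from the thesis.

```python
import math


def conditional_moments(x, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Mean and standard deviation of y given x for a bivariate normal.

    Mirrors equations A.3 and A.4: the mean shifts linearly in x and the
    variance shrinks by the factor (1 - rho^2).
    """
    mean = mu_y + rho * (sigma_y / sigma_x) * (x - mu_x)
    sd = sigma_y * math.sqrt(1.0 - rho ** 2)
    return mean, sd


def standardized_threshold(c, mean, sd):
    """alpha = (c - conditional mean) / conditional standard deviation."""
    return (c - mean) / sd


# Hypothetical current estimates theta^(k) and one observed x value.
mu_y_given_x, sd_y_given_x = conditional_moments(
    x=1.0, mu_x=0.0, mu_y=1.0, sigma_x=1.0, sigma_y=2.0, rho=0.5
)
alpha_y = standardized_threshold(c=3.0, mean=mu_y_given_x, sd=sd_y_given_x)
print(mu_y_given_x, sd_y_given_x, alpha_y)
```

The same two functions serve Scenario 3 if the roles of x and y are exchanged in the arguments.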
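The mean and variance of $y_i^{rc}$ conditioned on both $x_i^{o} = x$ and $y_i^{rc} > c_y^{rc}$ can be obtained by incorporating A.3 and A.4 as

\begin{align*}
E\big(y_i^{rc} \mid x_i^{o}=x,\ y_i^{rc}>c_y^{rc},\ \mu_{y|x}^{(k)},\ \sigma_{y|x}^{2(k)}\big) &= \mu_{y|x}^{(k)} + \sigma_{y|x}^{(k)} \frac{\phi(\alpha_y)}{1-\Phi(\alpha_y)} \\
\mathrm{Var}\big(y_i^{rc} \mid x_i^{o}=x,\ y_i^{rc}>c_y^{rc},\ \mu_{y|x}^{(k)},\ \sigma_{y|x}^{2(k)}\big) &= \sigma_{y|x}^{2(k)} \left[ 1 + \alpha_y \frac{\phi(\alpha_y)}{1-\Phi(\alpha_y)} - \left( \frac{\phi(\alpha_y)}{1-\Phi(\alpha_y)} \right)^{2} \right] \\
E\big(y_i^{2(rc)} \mid x_i^{o}=x,\ y_i^{rc}>c_y^{rc},\ \mu_{y|x}^{(k)},\ \sigma_{y|x}^{2(k)}\big) &= \Big[ E\big(y_i^{rc} \mid x_i^{o}=x,\ y_i^{rc}>c_y^{rc},\ \mu_{y|x}^{(k)},\ \sigma_{y|x}^{2(k)}\big) \Big]^{2} \\
&\quad + \mathrm{Var}\big(y_i^{rc} \mid x_i^{o}=x,\ y_i^{rc}>c_y^{rc},\ \mu_{y|x}^{(k)},\ \sigma_{y|x}^{2(k)}\big).
\end{align*}

Scenario 3: $x_i$ right censored, $y_i$ observed

This is similar to scenario 2, and the sufficient statistics need to be computed when $x_i$ is right censored while $y_i$ is observed in a set of observations. The expected value of $y_i$, when it is observed, is $y_i$ itself, and the right censored $x_i$ is replaced with the expected value of $x_i^{rc}$ conditioned on the value of $y_i^{o}$ and $x_i^{rc} > c_x^{rc}$. The computation of the E–step for the third scenario includes the formulas

\begin{align}
E_{\theta^{(k)}}\Big(\sum_{i=p+1}^{q} x_i^{rc} \,\Big|\, x_i^{rc}, y_i^{o}\Big) &= \sum_{i=p+1}^{q} E\big(x_i^{rc} \mid y_i^{o}=y,\ x_i^{rc}>c_x^{rc},\ \theta^{(k)}\big) \tag{A.5}\\
E_{\theta^{(k)}}\Big(\sum_{i=p+1}^{q} y_i^{o} \,\Big|\, x_i^{rc}, y_i^{o}\Big) &= \sum_{i=p+1}^{q} y_i^{o} \notag\\
E_{\theta^{(k)}}\Big(\sum_{i=p+1}^{q} x_i^{2(rc)} \,\Big|\, x_i^{rc}, y_i^{o}\Big) &= \sum_{i=p+1}^{q} E\big(x_i^{2(rc)} \mid y_i^{o}=y,\ x_i^{rc}>c_x^{rc},\ \theta^{(k)}\big) \tag{A.6}\\
E_{\theta^{(k)}}\Big(\sum_{i=p+1}^{q} y_i^{2(o)} \,\Big|\, x_i^{rc}, y_i^{o}\Big) &= \sum_{i=p+1}^{q} y_i^{2(o)} \notag\\
E_{\theta^{(k)}}\Big(\sum_{i=p+1}^{q} x_i^{rc} y_i^{o} \,\Big|\, x_i^{rc}, y_i^{o}\Big) &= \sum_{i=p+1}^{q} y_i^{o} \cdot E\big(x_i^{rc} \mid y_i^{o}=y,\ x_i^{rc}>c_x^{rc},\ \theta^{(k)}\big). \notag
\end{align}

$E(x_i^{rc} \mid y_i^{o}=y,\ x_i^{rc}>c_x^{rc},\ \theta^{(k)})$ and $E(x_i^{2(rc)} \mid y_i^{o}=y,\ x_i^{rc}>c_x^{rc},\ \theta^{(k)})$ in equations A.5 and A.6 can be computed for each right censored observation, similar to scenario 2. The conditional mean and variance can be obtained using

\begin{align}
\mu_{x|y}^{(k)} = E\big(x_i \mid y_i^{o}=y,\ \theta^{(k)}\big) &= \mu_x^{(k)} + \rho^{(k)} \frac{\sigma_x^{(k)}}{\sigma_y^{(k)}} \big(y_i^{o} - \mu_y^{(k)}\big) \tag{A.7}\\
\sigma_{x|y}^{2(k)} = \mathrm{Var}\big(x_i \mid y_i^{o}=y,\ \theta^{(k)}\big) &= \sigma_x^{2(k)} \big(1 - \rho^{2(k)}\big) \tag{A.8}
\end{align}

where $\alpha_x = \big(c_x^{rc} - \mu_{x|y}^{(k)}\big) / \sigma_{x|y}^{(k)}$, and $\mu_{x|y}^{(k)}$ and $\sigma_{x|y}^{(k)}$ are the mean and standard deviation of $x_i^{rc}$ given $y_i^{o} = y$ at the $k$th iteration, respectively.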
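The right censored moments take the same form for either variable once the conditional mean, conditional standard deviation, and threshold are known. A minimal sketch under that assumption is given below; `right_censored_moments` is a hypothetical helper name, $\phi$ and $\Phi$ are taken to be the standard normal PDF and CDF, and the survival function $1-\Phi$ is evaluated through `math.erfc` as a numerical-accuracy choice rather than anything prescribed by the thesis.

```python
import math

def std_normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def std_normal_sf(z):
    """Survival function 1 - Phi(z), via erfc for accuracy in the upper tail."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def right_censored_moments(threshold, mu_cond, sd_cond):
    """First and second moments of a normal variable known to exceed `threshold`.

    mu_cond and sd_cond are the conditional mean and standard deviation
    (A.3/A.4 for y given x, A.7/A.8 for x given y).
    """
    alpha = (threshold - mu_cond) / sd_cond
    hazard = std_normal_pdf(alpha) / std_normal_sf(alpha)
    mean = mu_cond + sd_cond * hazard
    var = sd_cond ** 2 * (1.0 + alpha * hazard - hazard ** 2)
    second_moment = mean ** 2 + var
    return mean, var, second_moment

# Illustrative check: a standard normal censored at its own mean (alpha = 0).
mean, var, second_moment = right_censored_moments(0.0, 0.0, 1.0)
```

For a threshold far above the conditional mean, $1-\Phi(\alpha)$ becomes very small, so an implementation may need additional safeguards against division by a value that underflows to zero.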
The mean and variance of $x_i^{rc}$ conditioned on both $y_i^{o} = y$ and $x_i^{rc} > c_x^{rc}$ can be obtained by incorporating A.7 and A.8:

\begin{align*}
E\big(x_i^{rc} \mid y_i^{o}=y,\ x_i^{rc}>c_x^{rc},\ \mu_{x|y}^{(k)},\ \sigma_{x|y}^{2(k)}\big) &= \mu_{x|y}^{(k)} + \sigma_{x|y}^{(k)} \frac{\phi(\alpha_x)}{1-\Phi(\alpha_x)} \\
\mathrm{Var}\big(x_i^{rc} \mid y_i^{o}=y,\ x_i^{rc}>c_x^{rc},\ \mu_{x|y}^{(k)},\ \sigma_{x|y}^{2(k)}\big) &= \sigma_{x|y}^{2(k)} \left[ 1 + \alpha_x \frac{\phi(\alpha_x)}{1-\Phi(\alpha_x)} - \left( \frac{\phi(\alpha_x)}{1-\Phi(\alpha_x)} \right)^{2} \right] \\
E\big(x_i^{2(rc)} \mid y_i^{o}=y,\ x_i^{rc}>c_x^{rc},\ \mu_{x|y}^{(k)},\ \sigma_{x|y}^{2(k)}\big) &= \Big[ E\big(x_i^{rc} \mid y_i^{o}=y,\ x_i^{rc}>c_x^{rc},\ \mu_{x|y}^{(k)},\ \sigma_{x|y}^{2(k)}\big) \Big]^{2} \\
&\quad + \mathrm{Var}\big(x_i^{rc} \mid y_i^{o}=y,\ x_i^{rc}>c_x^{rc},\ \mu_{x|y}^{(k)},\ \sigma_{x|y}^{2(k)}\big).
\end{align*}

Scenario 4: $x_i$ right censored, $y_i$ right censored

The expected values of $x_i$ and $y_i$ cannot be calculated directly because both variables are right censored. The expected values of $x_i^{rc}$ and $y_i^{rc}$ need to be calculated conditioned on $x_i^{rc} > c_x^{rc}$ and $y_i^{rc} > c_y^{rc}$, in which $c_x^{rc}$ and $c_y^{rc}$ are the thresholds of x and y, respectively. The computation of the E–step includes the formulas

\begin{align*}
E_{\theta^{(k)}}\Big(\sum_{i=q+1}^{n} x_i^{rc} \,\Big|\, x_i^{rc}, y_i^{rc}\Big) &= \sum_{i=q+1}^{n} E\big(x_i^{rc} \mid x_i^{rc}>c_x^{rc},\ y_i^{rc}>c_y^{rc},\ \theta^{(k)}\big) \\
E_{\theta^{(k)}}\Big(\sum_{i=q+1}^{n} y_i^{rc} \,\Big|\, x_i^{rc}, y_i^{rc}\Big) &= \sum_{i=q+1}^{n} E\big(y_i^{rc} \mid x_i^{rc}>c_x^{rc},\ y_i^{rc}>c_y^{rc},\ \theta^{(k)}\big) \\
E_{\theta^{(k)}}\Big(\sum_{i=q+1}^{n} x_i^{2(rc)} \,\Big|\, x_i^{rc}, y_i^{rc}\Big) &= \sum_{i=q+1}^{n} E\big(x_i^{2(rc)} \mid x_i^{rc}>c_x^{rc},\ y_i^{rc}>c_y^{rc},\ \theta^{(k)}\big) \\
E_{\theta^{(k)}}\Big(\sum_{i=q+1}^{n} y_i^{2(rc)} \,\Big|\, x_i^{rc}, y_i^{rc}\Big) &= \sum_{i=q+1}^{n} E\big(y_i^{2(rc)} \mid x_i^{rc}>c_x^{rc},\ y_i^{rc}>c_y^{rc},\ \theta^{(k)}\big) \\
E_{\theta^{(k)}}\Big(\sum_{i=q+1}^{n} x_i^{rc} y_i^{rc} \,\Big|\, x_i^{rc}, y_i^{rc}\Big) &= \sum_{i=q+1}^{n} E\big(x_i^{rc} \mid x_i^{rc}>c_x^{rc},\ y_i^{rc}>c_y^{rc},\ \theta^{(k)}\big) \cdot E\big(y_i^{rc} \mid x_i^{rc}>c_x^{rc},\ y_i^{rc}>c_y^{rc},\ \theta^{(k)}\big).
\end{align*}

The first moments of the BVN data with both variables being right censored can be obtained using the integral formulas

\begin{align*}
E\big(x_i^{rc} \mid x_i^{rc}>c_x^{rc},\ y_i^{rc}>c_y^{rc}\big) &= \frac{\int_{c_x^{rc}}^{\infty}\int_{c_y^{rc}}^{\infty} x\, f_{x,y}(x,y)\, dy\, dx}{P\big(x_i^{rc}>c_x^{rc},\ y_i^{rc}>c_y^{rc}\big)} = \frac{\int_{c_x^{rc}}^{\infty}\int_{c_y^{rc}}^{\infty} x\, f_{x,y}(x,y)\, dy\, dx}{\int_{c_x^{rc}}^{\infty}\int_{c_y^{rc}}^{\infty} f_{x,y}(x,y)\, dy\, dx} \\
E\big(y_i^{rc} \mid x_i^{rc}>c_x^{rc},\ y_i^{rc}>c_y^{rc}\big) &= \frac{\int_{c_x^{rc}}^{\infty}\int_{c_y^{rc}}^{\infty} y\, f_{x,y}(x,y)\, dy\, dx}{P\big(x_i^{rc}>c_x^{rc},\ y_i^{rc}>c_y^{rc}\big)} = \frac{\int_{c_x^{rc}}^{\infty}\int_{c_y^{rc}}^{\infty} y\, f_{x,y}(x,y)\, dy\, dx}{\int_{c_x^{rc}}^{\infty}\int_{c_y^{rc}}^{\infty} f_{x,y}(x,y)\, dy\, dx}.
\end{align*}

Similarly, the second moments of $x_i$ and $y_i$ can be computed using

\begin{align*}
E\big(x_i^{2(rc)} \mid x_i^{rc}>c_x^{rc},\ y_i^{rc}>c_y^{rc}\big) &= \frac{\int_{c_x^{rc}}^{\infty}\int_{c_y^{rc}}^{\infty} x^{2} f_{x,y}(x,y)\, dy\, dx}{P\big(x_i^{rc}>c_x^{rc},\ y_i^{rc}>c_y^{rc}\big)} = \frac{\int_{c_x^{rc}}^{\infty}\int_{c_y^{rc}}^{\infty} x^{2} f_{x,y}(x,y)\, dy\, dx}{\int_{c_x^{rc}}^{\infty}\int_{c_y^{rc}}^{\infty} f_{x,y}(x,y)\, dy\, dx} \\
E\big(y_i^{2(rc)} \mid x_i^{rc}>c_x^{rc},\ y_i^{rc}>c_y^{rc}\big) &= \frac{\int_{c_x^{rc}}^{\infty}\int_{c_y^{rc}}^{\infty} y^{2} f_{x,y}(x,y)\, dy\, dx}{P\big(x_i^{rc}>c_x^{rc},\ y_i^{rc}>c_y^{rc}\big)} = \frac{\int_{c_x^{rc}}^{\infty}\int_{c_y^{rc}}^{\infty} y^{2} f_{x,y}(x,y)\, dy\, dx}{\int_{c_x^{rc}}^{\infty}\int_{c_y^{rc}}^{\infty} f_{x,y}(x,y)\, dy\, dx}
\end{align*}

where $f_{x,y}(x,y)$ is the joint PDF of x and y. The adaptIntegrate function in the cubature package for R [R Core Team, 2022] is used to evaluate these integrals numerically, because a closed–form solution is difficult to obtain.

Computing the Total Expectation for Right Censored Data

The total expectations in equations 3.2–3.6 are obtained by summing the expected values obtained from all four scenarios.

\begin{align}
s_1^{(k)} = E_{\theta^{(k)}}\Big(\sum_{i=1}^{n} x_i \,\Big|\, x_i^{o}, y_i^{o}, x_i^{rc}, y_i^{rc}\Big) &= \sum_{i=1}^{m} x_i^{o} + \sum_{i=m+1}^{p} x_i^{o} + \sum_{i=p+1}^{q} E\big(x_i^{rc} \mid y_i^{o}=y,\ x_i^{rc}>c_x^{rc},\ \theta^{(k)}\big) \notag\\
&\quad + \sum_{i=q+1}^{n} E\big(x_i^{rc} \mid x_i^{rc}>c_x^{rc},\ y_i^{rc}>c_y^{rc},\ \theta^{(k)}\big) \tag{A.9}\\
s_2^{(k)} = E_{\theta^{(k)}}\Big(\sum_{i=1}^{n} y_i \,\Big|\, x_i^{o}, y_i^{o}, x_i^{rc}, y_i^{rc}\Big) &= \sum_{i=1}^{m} y_i^{o} + \sum_{i=m+1}^{p} E\big(y_i^{rc} \mid x_i^{o}=x,\ y_i^{rc}>c_y^{rc},\ \theta^{(k)}\big) + \sum_{i=p+1}^{q} y_i^{o} \notag\\
&\quad + \sum_{i=q+1}^{n} E\big(y_i^{rc} \mid x_i^{rc}>c_x^{rc},\ y_i^{rc}>c_y^{rc},\ \theta^{(k)}\big) \tag{A.10}\\
s_1^{2\,(k)} = E_{\theta^{(k)}}\Big(\sum_{i=1}^{n} x_i^{2} \,\Big|\, x_i^{o}, y_i^{o}, x_i^{rc}, y_i^{rc}\Big) &= \sum_{i=1}^{m} x_i^{2(o)} + \sum_{i=m+1}^{p} x_i^{2(o)} + \sum_{i=p+1}^{q} E\big(x_i^{2(rc)} \mid y_i^{o}=y,\ x_i^{rc}>c_x^{rc},\ \theta^{(k)}\big) \notag\\
&\quad + \sum_{i=q+1}^{n} E\big(x_i^{2(rc)} \mid x_i^{rc}>c_x^{rc},\ y_i^{rc}>c_y^{rc},\ \theta^{(k)}\big) \tag{A.11}\\
s_2^{2\,(k)} = E_{\theta^{(k)}}\Big(\sum_{i=1}^{n} y_i^{2} \,\Big|\, x_i^{o}, y_i^{o}, x_i^{rc}, y_i^{rc}\Big) &= \sum_{i=1}^{m} y_i^{2(o)} + \sum_{i=m+1}^{p} E\big(y_i^{2(rc)} \mid x_i^{o}=x,\ y_i^{rc}>c_y^{rc},\ \theta^{(k)}\big) + \sum_{i=p+1}^{q} y_i^{2(o)} \notag\\
&\quad + \sum_{i=q+1}^{n} E\big(y_i^{2(rc)} \mid x_i^{rc}>c_x^{rc},\ y_i^{rc}>c_y^{rc},\ \theta^{(k)}\big) \tag{A.12}\\
s_{12}^{(k)} = E_{\theta^{(k)}}\Big(\sum_{i=1}^{n} x_i y_i \,\Big|\, x_i^{o}, y_i^{o}, x_i^{rc}, y_i^{rc}\Big) &= \sum_{i=1}^{m} x_i^{o} \cdot y_i^{o} + \sum_{i=m+1}^{p} x_i^{o} \cdot E\big(y_i^{rc} \mid x_i^{o}=x,\ y_i^{rc}>c_y^{rc},\ \theta^{(k)}\big) \notag\\
&\quad + \sum_{i=p+1}^{q} y_i^{o} \cdot E\big(x_i^{rc} \mid y_i^{o}=y,\ x_i^{rc}>c_x^{rc},\ \theta^{(k)}\big) \notag\\
&\quad + \sum_{i=q+1}^{n} E\big(x_i^{rc} \mid x_i^{rc}>c_x^{rc},\ y_i^{rc}>c_y^{rc},\ \theta^{(k)}\big) \cdot E\big(y_i^{rc} \mid x_i^{rc}>c_x^{rc},\ y_i^{rc}>c_y^{rc},\ \theta^{(k)}\big) \tag{A.13}
\end{align}

A.1.2 M–step

The M–step follows the E–step: the expected complete data log–likelihood is maximised using the sufficient statistics calculated in equations A.9–A.13, and the parameters are updated accordingly. The formulas for the MLEs of complete data are given in equations 2.8–2.11 in Section 2.7.

A.2 EM Algorithm for Left Censored Observations

This section explains the computation of the E–step and the M–step when the data contain left censored observations.

A.2.1 E–step

The E–step of the EM algorithm requires computing the expected complete data log–likelihood conditioned on the observed data and the current parameter estimates. The computation must also account for the left censoring mechanism. Let (x, y) be the two BVN random variables, θ be the parameter vector, $c_x^{lc}$ be the threshold of x, and $c_y^{lc}$ be the threshold of y. The left censored data are assumed to be present in the dataset according to four scenarios, where “o” denotes observed data and “lc” denotes left censored data, as shown in Table A.2.

Scenario   Index            x             y
1          1, ..., m        $x_i^{o}$     $y_i^{o}$
2          m+1, ..., p      $x_i^{o}$     $y_i^{lc}$
3          p+1, ..., q      $x_i^{lc}$    $y_i^{o}$
4          q+1, ..., n      $x_i^{lc}$    $y_i^{lc}$

Table A.2: Scenarios with observed and left censored data for x and y

Scenario 1: $x_i$ observed, $y_i$ observed

The E–step of the EM algorithm for left censored data, when both $x_i$ and $y_i$ are observed, is the same as scenario 1 for missing observations. Therefore, it can be computed using equations 3.15–3.19.

Scenario 2: $x_i$ observed, $y_i$ left censored

The E–step involves computing the sufficient statistics expressed in equations 3.2–3.6 when $x_i$ is observed while $y_i$ is left censored at $c_y^{lc}$ in a set of observations.
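The expected value of $x_i$, when it is observed, is $x_i$ itself. However, the expected values of $y_i^{lc}$ must be calculated conditioned on the value of $x_i^{o}$, the censoring mechanism $y_i^{lc} < c_y^{lc}$, and the current parameter estimates. The computation of the E–step is expressed as

\begin{align}
E_{\theta^{(k)}}\Big(\sum_{i=m+1}^{p} x_i^{o} \,\Big|\, x_i^{o}, y_i^{lc}\Big) &= \sum_{i=m+1}^{p} x_i^{o} \notag\\
E_{\theta^{(k)}}\Big(\sum_{i=m+1}^{p} y_i^{lc} \,\Big|\, x_i^{o}, y_i^{lc}\Big) &= \sum_{i=m+1}^{p} E\big(y_i^{lc} \mid x_i^{o}=x,\ y_i^{lc}<c_y^{lc},\ \theta^{(k)}\big) \tag{A.14}\\
E_{\theta^{(k)}}\Big(\sum_{i=m+1}^{p} x_i^{2(o)} \,\Big|\, x_i^{o}, y_i^{lc}\Big) &= \sum_{i=m+1}^{p} x_i^{2(o)} \notag\\
E_{\theta^{(k)}}\Big(\sum_{i=m+1}^{p} y_i^{2(lc)} \,\Big|\, x_i^{o}, y_i^{lc}\Big) &= \sum_{i=m+1}^{p} E\big(y_i^{2(lc)} \mid x_i^{o}=x,\ y_i^{lc}<c_y^{lc},\ \theta^{(k)}\big) \tag{A.15}\\
E_{\theta^{(k)}}\Big(\sum_{i=m+1}^{p} x_i^{o} y_i^{lc} \,\Big|\, x_i^{o}, y_i^{lc}\Big) &= \sum_{i=m+1}^{p} x_i^{o} \cdot E\big(y_i^{lc} \mid x_i^{o}=x,\ y_i^{lc}<c_y^{lc},\ \theta^{(k)}\big). \notag
\end{align}

$E(y_i^{lc} \mid x_i^{o}=x,\ y_i^{lc}<c_y^{lc},\ \theta^{(k)})$ and $E(y_i^{2(lc)} \mid x_i^{o}=x,\ y_i^{lc}<c_y^{lc},\ \theta^{(k)})$ in equations A.14 and A.15 can be computed for each left censored observation using the conditional distribution of the BVN distribution and the left censored normal distribution. Let $\gamma_y = \big(c_y^{lc} - \mu_{y|x}^{(k)}\big) / \sigma_{y|x}^{(k)}$, where the conditional mean $\mu_{y|x}^{(k)}$ and variance $\sigma_{y|x}^{2(k)}$ at the $k$th iteration are obtained using equations A.3 and A.4. Then

\begin{align*}
E\big(y_i^{lc} \mid x_i^{o}=x,\ y_i^{lc}<c_y^{lc},\ \mu_{y|x}^{(k)},\ \sigma_{y|x}^{2(k)}\big) &= \mu_{y|x}^{(k)} - \sigma_{y|x}^{(k)} \frac{\phi(\gamma_y)}{\Phi(\gamma_y)} \\
\mathrm{Var}\big(y_i^{lc} \mid x_i^{o}=x,\ y_i^{lc}<c_y^{lc},\ \mu_{y|x}^{(k)},\ \sigma_{y|x}^{2(k)}\big) &= \sigma_{y|x}^{2(k)} \left[ 1 - \gamma_y \frac{\phi(\gamma_y)}{\Phi(\gamma_y)} - \left( \frac{\phi(\gamma_y)}{\Phi(\gamma_y)} \right)^{2} \right] \\
E\big(y_i^{2(lc)} \mid x_i^{o}=x,\ y_i^{lc}<c_y^{lc},\ \mu_{y|x}^{(k)},\ \sigma_{y|x}^{2(k)}\big) &= \Big[ E\big(y_i^{lc} \mid x_i^{o}=x,\ y_i^{lc}<c_y^{lc},\ \mu_{y|x}^{(k)},\ \sigma_{y|x}^{2(k)}\big) \Big]^{2} \\
&\quad + \mathrm{Var}\big(y_i^{lc} \mid x_i^{o}=x,\ y_i^{lc}<c_y^{lc},\ \mu_{y|x}^{(k)},\ \sigma_{y|x}^{2(k)}\big).
\end{align*}

Scenario 3: $x_i$ left censored, $y_i$ observed

This is similar to scenario 2, and the sufficient statistics need to be computed when $x_i$ is left censored at $c_x^{lc}$ while $y_i$ is observed in a set of observations. The expected value of $y_i$, when it is observed, is $y_i$ itself, and the left censored $x_i$ is replaced with the expected value of $x_i^{lc}$ conditioned on the value of $y_i^{o}$ and $x_i^{lc} < c_x^{lc}$. The computation of the E–step includes the formulas

\begin{align}
E_{\theta^{(k)}}\Big(\sum_{i=p+1}^{q} x_i^{lc} \,\Big|\, x_i^{lc}, y_i^{o}\Big) &= \sum_{i=p+1}^{q} E\big(x_i^{lc} \mid y_i^{o}=y,\ x_i^{lc}<c_x^{lc},\ \theta^{(k)}\big) \tag{A.16}\\
E_{\theta^{(k)}}\Big(\sum_{i=p+1}^{q} y_i^{o} \,\Big|\, x_i^{lc}, y_i^{o}\Big) &= \sum_{i=p+1}^{q} y_i^{o} \notag\\
E_{\theta^{(k)}}\Big(\sum_{i=p+1}^{q} x_i^{2(lc)} \,\Big|\, x_i^{lc}, y_i^{o}\Big) &= \sum_{i=p+1}^{q} E\big(x_i^{2(lc)} \mid y_i^{o}=y,\ x_i^{lc}<c_x^{lc},\ \theta^{(k)}\big) \tag{A.17}\\
E_{\theta^{(k)}}\Big(\sum_{i=p+1}^{q} y_i^{2(o)} \,\Big|\, x_i^{lc}, y_i^{o}\Big) &= \sum_{i=p+1}^{q} y_i^{2(o)} \notag\\
E_{\theta^{(k)}}\Big(\sum_{i=p+1}^{q} x_i^{lc} y_i^{o} \,\Big|\, x_i^{lc}, y_i^{o}\Big) &= \sum_{i=p+1}^{q} y_i^{o} \cdot E\big(x_i^{lc} \mid y_i^{o}=y,\ x_i^{lc}<c_x^{lc},\ \theta^{(k)}\big). \notag
\end{align}

$E(x_i^{lc} \mid y_i^{o}=y,\ x_i^{lc}<c_x^{lc},\ \theta^{(k)})$ and $E(x_i^{2(lc)} \mid y_i^{o}=y,\ x_i^{lc}<c_x^{lc},\ \theta^{(k)})$ in equations A.16 and A.17 can be computed similarly to scenario 2. Let $\gamma_x = \big(c_x^{lc} - \mu_{x|y}^{(k)}\big) / \sigma_{x|y}^{(k)}$, where $\mu_{x|y}^{(k)}$ and $\sigma_{x|y}^{(k)}$ are the mean and standard deviation of $x_i$ given $y_i^{o} = y$ at the $k$th iteration, respectively. They can be calculated using equations A.7 and A.8. Then

\begin{align*}
E\big(x_i^{lc} \mid y_i^{o}=y,\ x_i^{lc}<c_x^{lc},\ \mu_{x|y}^{(k)},\ \sigma_{x|y}^{2(k)}\big) &= \mu_{x|y}^{(k)} - \sigma_{x|y}^{(k)} \frac{\phi(\gamma_x)}{\Phi(\gamma_x)} \\
\mathrm{Var}\big(x_i^{lc} \mid y_i^{o}=y,\ x_i^{lc}<c_x^{lc},\ \mu_{x|y}^{(k)},\ \sigma_{x|y}^{2(k)}\big) &= \sigma_{x|y}^{2(k)} \left[ 1 - \gamma_x \frac{\phi(\gamma_x)}{\Phi(\gamma_x)} - \left( \frac{\phi(\gamma_x)}{\Phi(\gamma_x)} \right)^{2} \right]
\end{align*}

and, as in scenario 2, the second moment is the squared conditional mean plus the conditional variance.

Scenario 4: $x_i$ left censored, $y_i$ left censored

The expected values of $x_i$ and $y_i$ cannot be calculated directly because both variables are left censored. The expected values of $x_i^{lc}$ and $y_i^{lc}$ need to be computed conditionally on $x_i^{lc} < c_x^{lc}$ and $y_i^{lc} < c_y^{lc}$, where $c_x^{lc}$ and $c_y^{lc}$ are the thresholds of x and y, respectively. The computation of the E–step includes the formulas

\begin{align*}
E_{\theta^{(k)}}\Big(\sum_{i=q+1}^{n} x_i^{lc} \,\Big|\, x_i^{lc}, y_i^{lc}\Big) &= \sum_{i=q+1}^{n} E\big(x_i^{lc} \mid x_i^{lc}<c_x^{lc},\ y_i^{lc}<c_y^{lc},\ \theta^{(k)}\big) \\
E_{\theta^{(k)}}\Big(\sum_{i=q+1}^{n} y_i^{lc} \,\Big|\, x_i^{lc}, y_i^{lc}\Big) &= \sum_{i=q+1}^{n} E\big(y_i^{lc} \mid x_i^{lc}<c_x^{lc},\ y_i^{lc}<c_y^{lc},\ \theta^{(k)}\big) \\
E_{\theta^{(k)}}\Big(\sum_{i=q+1}^{n} x_i^{2(lc)} \,\Big|\, x_i^{lc}, y_i^{lc}\Big) &= \sum_{i=q+1}^{n} E\big(x_i^{2(lc)} \mid x_i^{lc}<c_x^{lc},\ y_i^{lc}<c_y^{lc},\ \theta^{(k)}\big) \\
E_{\theta^{(k)}}\Big(\sum_{i=q+1}^{n} y_i^{2(lc)} \,\Big|\, x_i^{lc}, y_i^{lc}\Big) &= \sum_{i=q+1}^{n} E\big(y_i^{2(lc)} \mid x_i^{lc}<c_x^{lc},\ y_i^{lc}<c_y^{lc},\ \theta^{(k)}\big) \\
E_{\theta^{(k)}}\Big(\sum_{i=q+1}^{n} x_i^{lc} y_i^{lc} \,\Big|\, x_i^{lc}, y_i^{lc}\Big) &= \sum_{i=q+1}^{n} E\big(x_i^{lc} \mid x_i^{lc}<c_x^{lc},\ y_i^{lc}<c_y^{lc},\ \theta^{(k)}\big) \cdot E\big(y_i^{lc} \mid x_i^{lc}<c_x^{lc},\ y_i^{lc}<c_y^{lc},\ \theta^{(k)}\big).
\end{align*}

The first moments of the BVN data with both variables being left censored can be obtained using the integral formulas

\begin{align*}
E\big(x_i^{lc} \mid x_i^{lc}<c_x^{lc},\ y_i^{lc}<c_y^{lc}\big) &= \frac{\int_{-\infty}^{c_x^{lc}}\int_{-\infty}^{c_y^{lc}} x\, f_{x,y}(x,y)\, dy\, dx}{P\big(x_i^{lc}<c_x^{lc},\ y_i^{lc}<c_y^{lc}\big)} = \frac{\int_{-\infty}^{c_x^{lc}}\int_{-\infty}^{c_y^{lc}} x\, f_{x,y}(x,y)\, dy\, dx}{\int_{-\infty}^{c_x^{lc}}\int_{-\infty}^{c_y^{lc}} f_{x,y}(x,y)\, dy\, dx} \\
E\big(y_i^{lc} \mid x_i^{lc}<c_x^{lc},\ y_i^{lc}<c_y^{lc}\big) &= \frac{\int_{-\infty}^{c_x^{lc}}\int_{-\infty}^{c_y^{lc}} y\, f_{x,y}(x,y)\, dy\, dx}{P\big(x_i^{lc}<c_x^{lc},\ y_i^{lc}<c_y^{lc}\big)} = \frac{\int_{-\infty}^{c_x^{lc}}\int_{-\infty}^{c_y^{lc}} y\, f_{x,y}(x,y)\, dy\, dx}{\int_{-\infty}^{c_x^{lc}}\int_{-\infty}^{c_y^{lc}} f_{x,y}(x,y)\, dy\, dx}.
\end{align*}

Similarly, the second moments of $x_i^{lc}$ and $y_i^{lc}$ can be computed using the equations

\begin{align*}
E\big(x_i^{2(lc)} \mid x_i^{lc}<c_x^{lc},\ y_i^{lc}<c_y^{lc}\big) &= \frac{\int_{-\infty}^{c_x^{lc}}\int_{-\infty}^{c_y^{lc}} x^{2} f_{x,y}(x,y)\, dy\, dx}{P\big(x_i^{lc}<c_x^{lc},\ y_i^{lc}<c_y^{lc}\big)} = \frac{\int_{-\infty}^{c_x^{lc}}\int_{-\infty}^{c_y^{lc}} x^{2} f_{x,y}(x,y)\, dy\, dx}{\int_{-\infty}^{c_x^{lc}}\int_{-\infty}^{c_y^{lc}} f_{x,y}(x,y)\, dy\, dx} \\
E\big(y_i^{2(lc)} \mid x_i^{lc}<c_x^{lc},\ y_i^{lc}<c_y^{lc}\big) &= \frac{\int_{-\infty}^{c_x^{lc}}\int_{-\infty}^{c_y^{lc}} y^{2} f_{x,y}(x,y)\, dy\, dx}{P\big(x_i^{lc}<c_x^{lc},\ y_i^{lc}<c_y^{lc}\big)} = \frac{\int_{-\infty}^{c_x^{lc}}\int_{-\infty}^{c_y^{lc}} y^{2} f_{x,y}(x,y)\, dy\, dx}{\int_{-\infty}^{c_x^{lc}}\int_{-\infty}^{c_y^{lc}} f_{x,y}(x,y)\, dy\, dx}
\end{align*}

where $f_{x,y}(x,y)$ is the joint PDF of x and y. The adaptIntegrate function in the cubature package for R [R Core Team, 2022] is used to evaluate these integrals numerically, because a closed–form solution is difficult to obtain.

Computing the Total Expectation for Left Censored Data

The total expectations in equations 3.2–3.6 are obtained by summing the expected values obtained from all four scenarios.

\begin{align}
s_1^{(k)} = E_{\theta^{(k)}}\Big(\sum_{i=1}^{n} x_i \,\Big|\, x_i^{o}, y_i^{o}, x_i^{lc}, y_i^{lc}\Big) &= \sum_{i=1}^{m} x_i^{o} + \sum_{i=m+1}^{p} x_i^{o} + \sum_{i=p+1}^{q} E\big(x_i^{lc} \mid y_i^{o}=y,\ x_i^{lc}<c_x^{lc},\ \theta^{(k)}\big) \notag\\
&\quad + \sum_{i=q+1}^{n} E\big(x_i^{lc} \mid x_i^{lc}<c_x^{lc},\ y_i^{lc}<c_y^{lc},\ \theta^{(k)}\big) \tag{A.18}\\
s_2^{(k)} = E_{\theta^{(k)}}\Big(\sum_{i=1}^{n} y_i \,\Big|\, x_i^{o}, y_i^{o}, x_i^{lc}, y_i^{lc}\Big) &= \sum_{i=1}^{m} y_i^{o} + \sum_{i=m+1}^{p} E\big(y_i^{lc} \mid x_i^{o}=x,\ y_i^{lc}<c_y^{lc},\ \theta^{(k)}\big) + \sum_{i=p+1}^{q} y_i^{o} \notag\\
&\quad + \sum_{i=q+1}^{n} E\big(y_i^{lc} \mid x_i^{lc}<c_x^{lc},\ y_i^{lc}<c_y^{lc},\ \theta^{(k)}\big) \tag{A.19}\\
s_1^{2\,(k)} = E_{\theta^{(k)}}\Big(\sum_{i=1}^{n} x_i^{2} \,\Big|\, x_i^{o}, y_i^{o}, x_i^{lc}, y_i^{lc}\Big) &= \sum_{i=1}^{m} x_i^{2(o)} + \sum_{i=m+1}^{p} x_i^{2(o)} + \sum_{i=p+1}^{q} E\big(x_i^{2(lc)} \mid y_i^{o}=y,\ x_i^{lc}<c_x^{lc},\ \theta^{(k)}\big) \notag\\
&\quad + \sum_{i=q+1}^{n} E\big(x_i^{2(lc)} \mid x_i^{lc}<c_x^{lc},\ y_i^{lc}<c_y^{lc},\ \theta^{(k)}\big) \tag{A.20}\\
s_2^{2\,(k)} = E_{\theta^{(k)}}\Big(\sum_{i=1}^{n} y_i^{2} \,\Big|\, x_i^{o}, y_i^{o}, x_i^{lc}, y_i^{lc}\Big) &= \sum_{i=1}^{m} y_i^{2(o)} + \sum_{i=m+1}^{p} E\big(y_i^{2(lc)} \mid x_i^{o}=x,\ y_i^{lc}<c_y^{lc},\ \theta^{(k)}\big) + \sum_{i=p+1}^{q} y_i^{2(o)} \notag\\
&\quad + \sum_{i=q+1}^{n} E\big(y_i^{2(lc)} \mid x_i^{lc}<c_x^{lc},\ y_i^{lc}<c_y^{lc},\ \theta^{(k)}\big) \tag{A.21}\\
s_{12}^{(k)} = E_{\theta^{(k)}}\Big(\sum_{i=1}^{n} x_i y_i \,\Big|\, x_i^{o}, y_i^{o}, x_i^{lc}, y_i^{lc}\Big) &= \sum_{i=1}^{m} x_i^{o} \cdot y_i^{o} + \sum_{i=m+1}^{p} x_i^{o} \cdot E\big(y_i^{lc} \mid x_i^{o}=x,\ y_i^{lc}<c_y^{lc},\ \theta^{(k)}\big) \notag\\
&\quad + \sum_{i=p+1}^{q} y_i^{o} \cdot E\big(x_i^{lc} \mid y_i^{o}=y,\ x_i^{lc}<c_x^{lc},\ \theta^{(k)}\big) \notag\\
&\quad + \sum_{i=q+1}^{n} E\big(x_i^{lc} \mid x_i^{lc}<c_x^{lc},\ y_i^{lc}<c_y^{lc},\ \theta^{(k)}\big) \cdot E\big(y_i^{lc} \mid x_i^{lc}<c_x^{lc},\ y_i^{lc}<c_y^{lc},\ \theta^{(k)}\big) \tag{A.22}
\end{align}

A.2.2 M–step

The M–step follows the E–step: the expected complete data log–likelihood is maximised using the sufficient statistics calculated in equations A.18–A.22, and the parameters are updated accordingly. The formulas for the MLEs of complete data are given in equations 2.8–2.11 in Section 2.7.

A.3 EM Algorithm for Interval Censored Observations

This section explains the computation of the E–step and the M–step when the data contain interval censored observations.

A.3.1 E–Step

The E–step of the EM algorithm requires computing the expected complete data log–likelihood conditioned on the observed data and the current parameter estimates. The computation must also account for the interval censoring mechanism. Let (x, y) be the two BVN random variables, θ be the parameter vector, and $[l_x^{ic}, u_x^{ic}]$ and $[l_y^{ic}, u_y^{ic}]$ be the intervals, given by their lower and upper limits, for variables x and y, respectively. The interval censored data are assumed to be present in the dataset according to four scenarios, where “o” denotes observed data and “ic” denotes interval censored data, as shown in Table A.3.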
Scenario   Index            x             y
1          1, ..., m        $x_i^{o}$     $y_i^{o}$
2          m+1, ..., p      $x_i^{o}$     $y_i^{ic}$
3          p+1, ..., q      $x_i^{ic}$    $y_i^{o}$
4          q+1, ..., n      $x_i^{ic}$    $y_i^{ic}$

Table A.3: Scenarios with observed and interval censored data for x and y

Scenario 1: $x_i$ observed, $y_i$ observed

The E–step of the EM algorithm, when both $x_i$ and $y_i$ are observed, is the same as scenario 1 for missing observations. Therefore, it can be computed using equations 3.15–3.19.

Scenario 2: $x_i$ observed, $y_i$ interval censored

The E–step involves computing the sufficient statistics expressed in equations 3.2–3.6 when $x_i$ is observed while $y_i$ is interval censored in $[l_y^{ic}, u_y^{ic}]$ in a set of observations. The expected values of $x_i$ can be directly obtained from the observed x. However, the expected values of $y_i^{ic}$ need to be calculated conditioned on the value of $x_i^{o}$, the censoring mechanism $l_y^{ic} < y_i^{ic} < u_y^{ic}$, and the current parameter estimates. The computation of the E–step is expressed as

\begin{align}
E_{\theta^{(k)}}\Big(\sum_{i=m+1}^{p} x_i^{o} \,\Big|\, x_i^{o}, y_i^{ic}\Big) &= \sum_{i=m+1}^{p} x_i^{o} \notag\\
E_{\theta^{(k)}}\Big(\sum_{i=m+1}^{p} y_i^{ic} \,\Big|\, x_i^{o}, y_i^{ic}\Big) &= \sum_{i=m+1}^{p} E\big(y_i^{ic} \mid x_i^{o}=x,\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \theta^{(k)}\big) \tag{A.23}\\
E_{\theta^{(k)}}\Big(\sum_{i=m+1}^{p} x_i^{2(o)} \,\Big|\, x_i^{o}, y_i^{ic}\Big) &= \sum_{i=m+1}^{p} x_i^{2(o)} \notag\\
E_{\theta^{(k)}}\Big(\sum_{i=m+1}^{p} y_i^{2(ic)} \,\Big|\, x_i^{o}, y_i^{ic}\Big) &= \sum_{i=m+1}^{p} E\big(y_i^{2(ic)} \mid x_i^{o}=x,\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \theta^{(k)}\big) \tag{A.24}\\
E_{\theta^{(k)}}\Big(\sum_{i=m+1}^{p} x_i^{o} y_i^{ic} \,\Big|\, x_i^{o}, y_i^{ic}\Big) &= \sum_{i=m+1}^{p} x_i^{o} \cdot E\big(y_i^{ic} \mid x_i^{o}=x,\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \theta^{(k)}\big). \notag
\end{align}

$E(y_i^{ic} \mid x_i^{o}=x,\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \theta^{(k)})$ and $E(y_i^{2(ic)} \mid x_i^{o}=x,\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \theta^{(k)})$ in equations A.23 and A.24 can be computed for interval censored observations using the conditional distribution of the BVN distribution and the interval censored normal distribution. Let $\eta_y = \big(l_y^{ic} - \mu_{y|x}^{(k)}\big) / \sigma_{y|x}^{(k)}$ and $\delta_y = \big(u_y^{ic} - \mu_{y|x}^{(k)}\big) / \sigma_{y|x}^{(k)}$, where $\mu_{y|x}^{(k)}$ and $\sigma_{y|x}^{(k)}$ are the mean and standard deviation of $y_i$ given $x_i^{o} = x$ at the $k$th iteration, respectively. They can be calculated from equations A.3 and A.4.
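Given the standardized limits $\eta_y$ and $\delta_y$, the first two moments of an interval censored value follow from the standard results for a doubly truncated normal distribution. The sketch below is one way to evaluate them; `interval_censored_moments` is a hypothetical helper name, $\phi$ and $\Phi$ are assumed to be the standard normal PDF and CDF, and the inputs are illustrative.

```python
import math

def std_normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def interval_censored_moments(lower, upper, mu_cond, sd_cond):
    """Mean, variance and second moment of a normal variable known to lie in (lower, upper)."""
    eta = (lower - mu_cond) / sd_cond      # standardized lower limit
    delta = (upper - mu_cond) / sd_cond    # standardized upper limit
    mass = std_normal_cdf(delta) - std_normal_cdf(eta)
    ratio = (std_normal_pdf(delta) - std_normal_pdf(eta)) / mass
    mean = mu_cond - sd_cond * ratio
    var = sd_cond ** 2 * (
        1.0
        - (delta * std_normal_pdf(delta) - eta * std_normal_pdf(eta)) / mass
        - ratio ** 2
    )
    return mean, var, mean ** 2 + var

# Illustrative checks: a symmetric interval and a very wide interval.
mean, var, second_moment = interval_censored_moments(-1.0, 1.0, 0.0, 1.0)
wide_mean, wide_var, _ = interval_censored_moments(-50.0, 50.0, 0.0, 1.0)
```

For a symmetric interval around the conditional mean the expected value equals that mean, and as the interval widens the variance approaches the untruncated conditional variance, which gives two quick sanity checks on an implementation.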
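\begin{align*}
E\big(y_i^{ic} \mid x_i^{o}=x,\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \mu_{y|x}^{(k)},\ \sigma_{y|x}^{2(k)}\big) &= \mu_{y|x}^{(k)} - \sigma_{y|x}^{(k)} \frac{\phi(\delta_y) - \phi(\eta_y)}{\Phi(\delta_y) - \Phi(\eta_y)} \\
\mathrm{Var}\big(y_i^{ic} \mid x_i^{o}=x,\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \mu_{y|x}^{(k)},\ \sigma_{y|x}^{2(k)}\big) &= \sigma_{y|x}^{2(k)} \left[ 1 - \frac{\delta_y \phi(\delta_y) - \eta_y \phi(\eta_y)}{\Phi(\delta_y) - \Phi(\eta_y)} - \left( \frac{\phi(\delta_y) - \phi(\eta_y)}{\Phi(\delta_y) - \Phi(\eta_y)} \right)^{2} \right] \\
E\big(y_i^{2(ic)} \mid x_i^{o}=x,\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \mu_{y|x}^{(k)},\ \sigma_{y|x}^{2(k)}\big) &= \Big[ E\big(y_i^{ic} \mid x_i^{o}=x,\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \mu_{y|x}^{(k)},\ \sigma_{y|x}^{2(k)}\big) \Big]^{2} \\
&\quad + \mathrm{Var}\big(y_i^{ic} \mid x_i^{o}=x,\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \mu_{y|x}^{(k)},\ \sigma_{y|x}^{2(k)}\big).
\end{align*}

Scenario 3: $x_i$ interval censored, $y_i$ observed

This is similar to scenario 2, and the sufficient statistics need to be computed when $x_i$ is interval censored in $[l_x^{ic}, u_x^{ic}]$ while $y_i$ is observed in a set of observations. The expected values of $y_i$ can be directly obtained from the observed values in y, and the interval censored $x_i$ is replaced with the expected value of $x_i^{ic}$ conditioned on the value of $y_i^{o}$ and $l_x^{ic} < x_i^{ic} < u_x^{ic}$. The computation of the E–step for the third scenario includes the formulas

\begin{align}
E_{\theta^{(k)}}\Big(\sum_{i=p+1}^{q} x_i^{ic} \,\Big|\, x_i^{ic}, y_i^{o}\Big) &= \sum_{i=p+1}^{q} E\big(x_i^{ic} \mid y_i^{o}=y,\ l_x^{ic}<x_i^{ic}<u_x^{ic},\ \theta^{(k)}\big) \tag{A.25}\\
E_{\theta^{(k)}}\Big(\sum_{i=p+1}^{q} y_i^{o} \,\Big|\, x_i^{ic}, y_i^{o}\Big) &= \sum_{i=p+1}^{q} y_i^{o} \notag\\
E_{\theta^{(k)}}\Big(\sum_{i=p+1}^{q} x_i^{2(ic)} \,\Big|\, x_i^{ic}, y_i^{o}\Big) &= \sum_{i=p+1}^{q} E\big(x_i^{2(ic)} \mid y_i^{o}=y,\ l_x^{ic}<x_i^{ic}<u_x^{ic},\ \theta^{(k)}\big) \tag{A.26}\\
E_{\theta^{(k)}}\Big(\sum_{i=p+1}^{q} y_i^{2(o)} \,\Big|\, x_i^{ic}, y_i^{o}\Big) &= \sum_{i=p+1}^{q} y_i^{2(o)} \notag\\
E_{\theta^{(k)}}\Big(\sum_{i=p+1}^{q} x_i^{ic} y_i^{o} \,\Big|\, x_i^{ic}, y_i^{o}\Big) &= \sum_{i=p+1}^{q} y_i^{o} \cdot E\big(x_i^{ic} \mid y_i^{o}=y,\ l_x^{ic}<x_i^{ic}<u_x^{ic},\ \theta^{(k)}\big). \notag
\end{align}

$E(x_i^{ic} \mid y_i^{o}=y,\ l_x^{ic}<x_i^{ic}<u_x^{ic},\ \theta^{(k)})$ and $E(x_i^{2(ic)} \mid y_i^{o}=y,\ l_x^{ic}<x_i^{ic}<u_x^{ic},\ \theta^{(k)})$ in equations A.25 and A.26 can be computed for interval censored observations using the conditional distribution of the BVN distribution and the interval censored normal distribution. Let $\eta_x = \big(l_x^{ic} - \mu_{x|y}^{(k)}\big) / \sigma_{x|y}^{(k)}$ and $\delta_x = \big(u_x^{ic} - \mu_{x|y}^{(k)}\big) / \sigma_{x|y}^{(k)}$, where $\mu_{x|y}^{(k)}$ and $\sigma_{x|y}^{(k)}$ are the mean and standard deviation of $x_i$ given $y_i^{o} = y$ at the $k$th iteration, respectively (equations A.7 and A.8). Then

\begin{align*}
E\big(x_i^{ic} \mid y_i^{o}=y,\ l_x^{ic}<x_i^{ic}<u_x^{ic},\ \mu_{x|y}^{(k)},\ \sigma_{x|y}^{2(k)}\big) &= \mu_{x|y}^{(k)} - \sigma_{x|y}^{(k)} \frac{\phi(\delta_x) - \phi(\eta_x)}{\Phi(\delta_x) - \Phi(\eta_x)} \\
\mathrm{Var}\big(x_i^{ic} \mid y_i^{o}=y,\ l_x^{ic}<x_i^{ic}<u_x^{ic},\ \mu_{x|y}^{(k)},\ \sigma_{x|y}^{2(k)}\big) &= \sigma_{x|y}^{2(k)} \left[ 1 - \frac{\delta_x \phi(\delta_x) - \eta_x \phi(\eta_x)}{\Phi(\delta_x) - \Phi(\eta_x)} - \left( \frac{\phi(\delta_x) - \phi(\eta_x)}{\Phi(\delta_x) - \Phi(\eta_x)} \right)^{2} \right] \\
E\big(x_i^{2(ic)} \mid y_i^{o}=y,\ l_x^{ic}<x_i^{ic}<u_x^{ic},\ \mu_{x|y}^{(k)},\ \sigma_{x|y}^{2(k)}\big) &= \Big[ E\big(x_i^{ic} \mid y_i^{o}=y,\ l_x^{ic}<x_i^{ic}<u_x^{ic},\ \mu_{x|y}^{(k)},\ \sigma_{x|y}^{2(k)}\big) \Big]^{2} \\
&\quad + \mathrm{Var}\big(x_i^{ic} \mid y_i^{o}=y,\ l_x^{ic}<x_i^{ic}<u_x^{ic},\ \mu_{x|y}^{(k)},\ \sigma_{x|y}^{2(k)}\big).
\end{align*}

Scenario 4: $x_i$ interval censored, $y_i$ interval censored

The expected values of $x_i$ and $y_i$ cannot be calculated directly because both variables are interval censored. The expected values of $x_i$ and $y_i$ need to be computed conditioned on $l_x^{ic} < x_i < u_x^{ic}$ and $l_y^{ic} < y_i < u_y^{ic}$, in which $[l_x^{ic}, u_x^{ic}]$ and $[l_y^{ic}, u_y^{ic}]$ are the lower and upper limits of the intervals of x and y, respectively. The computation of the E–step includes the formulas

\begin{align*}
E_{\theta^{(k)}}\Big(\sum_{i=q+1}^{n} x_i^{ic} \,\Big|\, x_i^{ic}, y_i^{ic}\Big) &= \sum_{i=q+1}^{n} E\big(x_i^{ic} \mid l_x^{ic}<x_i^{ic}<u_x^{ic},\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \theta^{(k)}\big) \\
E_{\theta^{(k)}}\Big(\sum_{i=q+1}^{n} y_i^{ic} \,\Big|\, x_i^{ic}, y_i^{ic}\Big) &= \sum_{i=q+1}^{n} E\big(y_i^{ic} \mid l_x^{ic}<x_i^{ic}<u_x^{ic},\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \theta^{(k)}\big) \\
E_{\theta^{(k)}}\Big(\sum_{i=q+1}^{n} x_i^{2(ic)} \,\Big|\, x_i^{ic}, y_i^{ic}\Big) &= \sum_{i=q+1}^{n} E\big(x_i^{2(ic)} \mid l_x^{ic}<x_i^{ic}<u_x^{ic},\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \theta^{(k)}\big) \\
E_{\theta^{(k)}}\Big(\sum_{i=q+1}^{n} y_i^{2(ic)} \,\Big|\, x_i^{ic}, y_i^{ic}\Big) &= \sum_{i=q+1}^{n} E\big(y_i^{2(ic)} \mid l_x^{ic}<x_i^{ic}<u_x^{ic},\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \theta^{(k)}\big) \\
E_{\theta^{(k)}}\Big(\sum_{i=q+1}^{n} x_i^{ic} y_i^{ic} \,\Big|\, x_i^{ic}, y_i^{ic}\Big) &= \sum_{i=q+1}^{n} E\big(x_i^{ic} \mid l_x^{ic}<x_i^{ic}<u_x^{ic},\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \theta^{(k)}\big) \\
&\qquad\quad \times E\big(y_i^{ic} \mid l_x^{ic}<x_i^{ic}<u_x^{ic},\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \theta^{(k)}\big).
\end{align*}

The first moments of the BVN data with both variables being interval censored can be obtained using the integral formulas

\begin{align*}
E\big(x_i^{ic} \mid l_x^{ic}<x_i^{ic}<u_x^{ic},\ l_y^{ic}<y_i^{ic}<u_y^{ic}\big) &= \frac{\int_{l_x^{ic}}^{u_x^{ic}}\int_{l_y^{ic}}^{u_y^{ic}} x\, f_{x,y}(x,y)\, dy\, dx}{P\big(l_x^{ic}<x_i^{ic}<u_x^{ic},\ l_y^{ic}<y_i^{ic}<u_y^{ic}\big)} = \frac{\int_{l_x^{ic}}^{u_x^{ic}}\int_{l_y^{ic}}^{u_y^{ic}} x\, f_{x,y}(x,y)\, dy\, dx}{\int_{l_x^{ic}}^{u_x^{ic}}\int_{l_y^{ic}}^{u_y^{ic}} f_{x,y}(x,y)\, dy\, dx} \\
E\big(y_i^{ic} \mid l_x^{ic}<x_i^{ic}<u_x^{ic},\ l_y^{ic}<y_i^{ic}<u_y^{ic}\big) &= \frac{\int_{l_x^{ic}}^{u_x^{ic}}\int_{l_y^{ic}}^{u_y^{ic}} y\, f_{x,y}(x,y)\, dy\, dx}{P\big(l_x^{ic}<x_i^{ic}<u_x^{ic},\ l_y^{ic}<y_i^{ic}<u_y^{ic}\big)} = \frac{\int_{l_x^{ic}}^{u_x^{ic}}\int_{l_y^{ic}}^{u_y^{ic}} y\, f_{x,y}(x,y)\, dy\, dx}{\int_{l_x^{ic}}^{u_x^{ic}}\int_{l_y^{ic}}^{u_y^{ic}} f_{x,y}(x,y)\, dy\, dx}.
\end{align*}

Similarly, the second moments of $x_i$ and $y_i$ can be computed using the equations

\begin{align*}
E\big(x_i^{2(ic)} \mid l_x^{ic}<x_i^{ic}<u_x^{ic},\ l_y^{ic}<y_i^{ic}<u_y^{ic}\big) &= \frac{\int_{l_x^{ic}}^{u_x^{ic}}\int_{l_y^{ic}}^{u_y^{ic}} x^{2} f_{x,y}(x,y)\, dy\, dx}{P\big(l_x^{ic}<x_i^{ic}<u_x^{ic},\ l_y^{ic}<y_i^{ic}<u_y^{ic}\big)} = \frac{\int_{l_x^{ic}}^{u_x^{ic}}\int_{l_y^{ic}}^{u_y^{ic}} x^{2} f_{x,y}(x,y)\, dy\, dx}{\int_{l_x^{ic}}^{u_x^{ic}}\int_{l_y^{ic}}^{u_y^{ic}} f_{x,y}(x,y)\, dy\, dx} \\
E\big(y_i^{2(ic)} \mid l_x^{ic}<x_i^{ic}<u_x^{ic},\ l_y^{ic}<y_i^{ic}<u_y^{ic}\big) &= \frac{\int_{l_x^{ic}}^{u_x^{ic}}\int_{l_y^{ic}}^{u_y^{ic}} y^{2} f_{x,y}(x,y)\, dy\, dx}{P\big(l_x^{ic}<x_i^{ic}<u_x^{ic},\ l_y^{ic}<y_i^{ic}<u_y^{ic}\big)} = \frac{\int_{l_x^{ic}}^{u_x^{ic}}\int_{l_y^{ic}}^{u_y^{ic}} y^{2} f_{x,y}(x,y)\, dy\, dx}{\int_{l_x^{ic}}^{u_x^{ic}}\int_{l_y^{ic}}^{u_y^{ic}} f_{x,y}(x,y)\, dy\, dx}
\end{align*}

where $f_{x,y}(x,y)$ is the joint PDF of x and y. The adaptIntegrate function in the cubature package for R [R Core Team, 2022] is used to evaluate these integrals numerically, because a closed–form solution is difficult to obtain.

Computing the Total Expectation for Interval Censored Data

The total expectations in equations 3.2–3.6 are obtained by summing the expected values obtained from all four scenarios.

\begin{align}
s_1^{(k)} = E_{\theta^{(k)}}\Big(\sum_{i=1}^{n} x_i \,\Big|\, x_i^{o}, y_i^{o}, x_i^{ic}, y_i^{ic}\Big) &= \sum_{i=1}^{m} x_i^{o} + \sum_{i=m+1}^{p} x_i^{o} + \sum_{i=p+1}^{q} E\big(x_i^{ic} \mid y_i^{o}=y,\ l_x^{ic}<x_i^{ic}<u_x^{ic},\ \theta^{(k)}\big) \notag\\
&\quad + \sum_{i=q+1}^{n} E\big(x_i^{ic} \mid l_x^{ic}<x_i^{ic}<u_x^{ic},\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \theta^{(k)}\big) \tag{A.27}\\
s_2^{(k)} = E_{\theta^{(k)}}\Big(\sum_{i=1}^{n} y_i \,\Big|\, x_i^{o}, y_i^{o}, x_i^{ic}, y_i^{ic}\Big) &= \sum_{i=1}^{m} y_i^{o} + \sum_{i=m+1}^{p} E\big(y_i^{ic} \mid x_i^{o}=x,\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \theta^{(k)}\big) + \sum_{i=p+1}^{q} y_i^{o} \notag\\
&\quad + \sum_{i=q+1}^{n} E\big(y_i^{ic} \mid l_x^{ic}<x_i^{ic}<u_x^{ic},\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \theta^{(k)}\big) \tag{A.28}\\
s_1^{2\,(k)} = E_{\theta^{(k)}}\Big(\sum_{i=1}^{n} x_i^{2} \,\Big|\, x_i^{o}, y_i^{o}, x_i^{ic}, y_i^{ic}\Big) &= \sum_{i=1}^{m} x_i^{2(o)} + \sum_{i=m+1}^{p} x_i^{2(o)} + \sum_{i=p+1}^{q} E\big(x_i^{2(ic)} \mid y_i^{o}=y,\ l_x^{ic}<x_i^{ic}<u_x^{ic},\ \theta^{(k)}\big) \notag\\
&\quad + \sum_{i=q+1}^{n} E\big(x_i^{2(ic)} \mid l_x^{ic}<x_i^{ic}<u_x^{ic},\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \theta^{(k)}\big) \tag{A.29}\\
s_2^{2\,(k)} = E_{\theta^{(k)}}\Big(\sum_{i=1}^{n} y_i^{2} \,\Big|\, x_i^{o}, y_i^{o}, x_i^{ic}, y_i^{ic}\Big) &= \sum_{i=1}^{m} y_i^{2(o)} + \sum_{i=m+1}^{p} E\big(y_i^{2(ic)} \mid x_i^{o}=x,\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \theta^{(k)}\big) + \sum_{i=p+1}^{q} y_i^{2(o)} \notag\\
&\quad + \sum_{i=q+1}^{n} E\big(y_i^{2(ic)} \mid l_x^{ic}<x_i^{ic}<u_x^{ic},\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \theta^{(k)}\big) \tag{A.30}\\
s_{12}^{(k)} = E_{\theta^{(k)}}\Big(\sum_{i=1}^{n} x_i y_i \,\Big|\, x_i^{o}, y_i^{o}, x_i^{ic}, y_i^{ic}\Big) &= \sum_{i=1}^{m} x_i^{o} \cdot y_i^{o} + \sum_{i=m+1}^{p} x_i^{o} \cdot E\big(y_i^{ic} \mid x_i^{o}=x,\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \theta^{(k)}\big) \notag\\
&\quad + \sum_{i=p+1}^{q} y_i^{o} \cdot E\big(x_i^{ic} \mid y_i^{o}=y,\ l_x^{ic}<x_i^{ic}<u_x^{ic},\ \theta^{(k)}\big) \notag\\
&\quad + \sum_{i=q+1}^{n} \Big[ E\big(x_i^{ic} \mid l_x^{ic}<x_i^{ic}<u_x^{ic},\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \theta^{(k)}\big) \notag\\
&\qquad\quad \times E\big(y_i^{ic} \mid l_x^{ic}<x_i^{ic}<u_x^{ic},\ l_y^{ic}<y_i^{ic}<u_y^{ic},\ \theta^{(k)}\big) \Big] \tag{A.31}
\end{align}

A.3.2 M–step

The M–step follows the E–step: the expected complete data log–likelihood is maximised using the sufficient statistics calculated in equations A.27–A.31, and the parameters are updated accordingly. The formulas for the MLEs of complete data are given in equations 2.8–2.11 in Section 2.7.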
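To make the M–step concrete, the sketch below maps the five expected sufficient statistics to updated parameter values. It assumes that equations 2.8–2.11 are the usual complete-data maximum likelihood estimators of a bivariate normal distribution (with divisor n); those equations are not reproduced in this appendix, so that correspondence is an assumption. The function name `m_step` and the argument names `s11` and `s22` (standing for $s_1^{2\,(k)}$ and $s_2^{2\,(k)}$) are hypothetical.

```python
import math

def m_step(s1, s2, s11, s22, s12, n):
    """Update BVN parameters from expected sufficient statistics (assumed standard MLEs).

    s1, s2   : expected sums of x and y
    s11, s22 : expected sums of x^2 and y^2
    s12      : expected sum of x*y
    n        : total number of observations
    """
    mu_x = s1 / n
    mu_y = s2 / n
    var_x = s11 / n - mu_x ** 2
    var_y = s22 / n - mu_y ** 2
    cov_xy = s12 / n - mu_x * mu_y
    sigma_x = math.sqrt(var_x)
    sigma_y = math.sqrt(var_y)
    rho = cov_xy / (sigma_x * sigma_y)
    # Regression of y on x implied by the BVN parameters.
    beta1 = cov_xy / var_x          # equals rho * sigma_y / sigma_x
    beta0 = mu_y - beta1 * mu_x
    return {"mu_x": mu_x, "mu_y": mu_y, "sigma_x": sigma_x,
            "sigma_y": sigma_y, "rho": rho, "beta0": beta0, "beta1": beta1}

# Illustrative check with fully observed data x = (1, 2, 3, 4), y = 2x.
estimates = m_step(s1=10.0, s2=20.0, s11=30.0, s22=120.0, s12=60.0, n=4)
```

With censored data the same update would be applied to the statistics from A.27–A.31 (or A.9–A.13 and A.18–A.22), and the E–step and M–step would alternate until the estimates stabilise.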
Appendix B
Additional Simulation Results

B.1 Results in the Presence of Missing Observations in the Data

Figure B.1: Boxplots for estimates of mean of x ($\hat{\mu}_x$) and mean of y ($\hat{\mu}_y$), when n = 100, ρ = 0.2 over different missing proportions of 0.1, 0.3, and 0.5

Figure B.2: Boxplots for estimates of standard deviation of x ($\hat{\sigma}_x$) and standard deviation of y ($\hat{\sigma}_y$), when n = 100, ρ = 0.2 over different missing proportions of 0.1, 0.3, and 0.5

Figure B.3: Boxplots for estimates of correlation between x and y ($\hat{\rho}$), when n = 100, ρ = 0.2 over different missing proportions of 0.1, 0.3, and 0.5

Table B.1: Empirical variance and asymptotic variance of intercept ($\hat{\beta}_0$) when ρ = 0.2 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n     p     Empirical Variance         Asymptotic Variance
            CLS     Naive   PM         CLS     Naive   PM
50    0.1   0.460   0.515   0.481      0.545   0.594   0.547
50    0.3   0.460   0.697   0.592      0.545   0.786   0.536
50    0.5   0.460   0.848   0.630      0.545   0.963   0.514
100   0.1   0.215   0.239   0.231      0.267   0.296   0.267
100   0.3   0.215   0.305   0.263      0.267   0.370   0.263
100   0.5   0.215   0.444   0.345      0.267   0.478   0.253
500   0.1   0.052   0.056   0.055      0.052   0.058   0.053
500   0.3   0.052   0.075   0.065      0.052   0.073   0.052
500   0.5   0.052   0.087   0.067      0.052   0.093   0.050

Table B.2: Empirical variance and asymptotic variance of slope ($\hat{\beta}_1$) when ρ = 0.2 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n     p     Empirical Variance         Asymptotic Variance
            CLS     Naive   PM         CLS     Naive   PM
50    0.1   0.006   0.006   0.006      0.005   0.006   0.005
50    0.3   0.006   0.008   0.007      0.005   0.008   0.006
50    0.5   0.006   0.009   0.009      0.005   0.010   0.006
100   0.1   0.003   0.003   0.003      0.003   0.003   0.003
100   0.3   0.003   0.003   0.003      0.003   0.004   0.003
100   0.5   0.003   0.004   0.004      0.003   0.005   0.003
500   0.1   0.001   0.001   0.001      0.001   0.001   0.001
500   0.3   0.001   0.001   0.001      0.001   0.001   0.001
500   0.5   0.001   0.001   0.001      0.001   0.001   0.001

Table B.3: Empirical variance and asymptotic variance of mean of x ($\hat{\mu}_x$) when ρ = 0.2 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50,
100, and 500

n     p     Empirical Variance         Asymptotic Variance
            CLS     Naive   PM         CLS     Naive   PM
50    0.1   1.955   2.211   2.067      1.984   2.151   1.938
50    0.3   1.955   2.757   2.296      1.984   2.803   1.931
50    0.5   1.955   3.598   2.581      1.984   3.423   1.933
100   0.1   0.958   1.065   0.964      0.997   1.103   0.987
100   0.3   0.958   1.323   1.137      0.997   1.382   0.984
100   0.5   0.958   1.734   1.255      0.997   1.763   0.985
500   0.1   0.206   0.226   0.212      0.200   0.222   0.200
500   0.3   0.206   0.278   0.245      0.200   0.277   0.200
500   0.5   0.206   0.361   0.276      0.200   0.356   0.199

Table B.4: Bias, RMSE, 95% CP, and relative RMSE of mean of x ($\hat{\mu}_x$) for ρ = 0.2 over sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5

n     p     Bias                       RMSE                       CP                     RRMSE
            CLS     Naive   PM         CLS     Naive   PM         CLS    Naive  PM       CLS/PM  Naive/PM
50    0.1   0.099   0.121   0.116      1.412   1.472   1.397      0.95   0.93   0.93     1.01    1.05
50    0.3   0.099   0.135   0.120      1.412   1.680   1.395      0.95   0.95   0.90     1.01    1.20
50    0.5   0.099   0.097   0.148      1.412   1.853   1.398      0.95   0.95   0.89     1.01    1.33
100   0.1   0.064   0.052   0.069      1.001   1.051   0.996      0.96   0.95   0.94     1.00    1.06
100   0.3   0.064   0.064   0.051      1.001   1.177   0.993      0.96   0.96   0.93     1.01    1.19
100   0.5   0.064   0.043   0.047      1.001   1.328   0.994      0.96   0.94   0.92     1.01    1.34
500   0.1   0.027   0.011   0.020      0.448   0.471   0.447      0.94   0.94   0.94     1.00    1.05
500   0.3   0.027   0.001   0.012      0.448   0.526   0.447      0.94   0.93   0.92     1.00    1.18
500   0.5   0.027   0.010   0.002      0.448   0.596   0.447      0.94   0.95   0.89     1.00    1.34

Table B.5: Empirical variance and asymptotic variance of mean of y ($\hat{\mu}_y$) when ρ = 0.2 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n     p     Empirical Variance         Asymptotic Variance
            CLS     Naive   PM         CLS     Naive   PM
50    0.1   0.444   0.497   0.469      0.505   0.549   0.496
50    0.3   0.444   0.668   0.548      0.505   0.720   0.494
50    0.5   0.444   0.815   0.579      0.505   0.874   0.489
100   0.1   0.216   0.236   0.229      0.251   0.277   0.248
100   0.3   0.216   0.296   0.249      0.251   0.346   0.248
100   0.5   0.216   0.436   0.320      0.251   0.444   0.248
500   0.1   0.051   0.054   0.052      0.050   0.055   0.050
500   0.3   0.051   0.073   0.061      0.050   0.069   0.050
500   0.5   0.051   0.088   0.064      0.050   0.089   0.050

Table B.6: Bias, RMSE, 95% CP, and relative RMSE of mean of y ($\hat{\mu}_y$) for ρ = 0.2 over
sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5

n     p     Bias                       RMSE                       CP                     RRMSE
            CLS     Naive   PM         CLS     Naive   PM         CLS    Naive  PM       CLS/PM  Naive/PM
50    0.1   0.005   -0.007  0.008      0.711   0.741   0.704      0.95   0.96   0.95     1.01    1.05
50    0.3   0.005   0.003   0.006      0.711   0.849   0.703      0.95   0.95   0.93     1.01    1.21
50    0.5   0.005   0.000   -0.006     0.711   0.935   0.699      0.95   0.95   0.91     1.02    1.34
100   0.1   0.015   0.017   0.012      0.501   0.527   0.498      0.97   0.97   0.97     1.01    1.06
100   0.3   0.015   0.019   0.013      0.501   0.589   0.498      0.97   0.95   0.95     1.01    1.18
100   0.5   0.015   0.025   0.019      0.501   0.667   0.498      0.97   0.96   0.92     1.01    1.34
500   0.1   0.001   0.004   0.002      0.223   0.235   0.223      0.93   0.95   0.94     1.00    1.05
500   0.3   0.001   0.002   0.002      0.223   0.263   0.223      0.93   0.94   0.91     1.00    1.18
500   0.5   0.001   -0.007  -0.005     0.223   0.298   0.223      0.93   0.95   0.92     1.00    1.34

Table B.7: Empirical variance and asymptotic variance of standard deviation of x ($\hat{\sigma}_x$) when ρ = 0.2 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n     p     Empirical Variance         Asymptotic Variance
            CLS     Naive   PM         CLS     Naive   PM
50    0.1   1.104   1.164   1.122      1.023   1.112   0.971
50    0.3   1.104   1.585   1.228      1.023   1.466   1.002
50    0.5   1.104   1.971   1.410      1.023   1.809   1.052
100   0.1   0.551   0.597   0.556      0.506   0.561   0.495
100   0.3   0.551   0.810   0.634      0.506   0.706   0.508
100   0.5   0.551   0.876   0.723      0.506   0.906   0.540
500   0.1   0.102   0.113   0.106      0.100   0.111   0.100
500   0.3   0.102   0.145   0.120      0.100   0.139   0.103
500   0.5   0.102   0.183   0.138      0.100   0.179   0.109

Table B.8: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of x ($\hat{\sigma}_x$) for ρ = 0.2 over sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5

n     p     Bias                       RMSE                       CP                     RRMSE
            CLS     Naive   PM         CLS     Naive   PM         CLS    Naive  PM       CLS/PM  Naive/PM
50    0.1   -0.096  -0.102  -0.214     1.016   1.060   1.009      0.94   0.94   0.93     1.01    1.05
50    0.3   -0.096  -0.135  -0.239     1.016   1.218   1.029      0.94   0.93   0.89     0.99    1.18
50    0.5   -0.096  -0.150  -0.249     1.016   1.353   1.056      0.94   0.91   0.88     0.96    1.28
100   0.1   -0.043  -0.056  -0.095     0.713   0.751   0.710      0.94   0.94   0.92     1.00    1.06
100   0.3   -0.043  -0.050  -0.115     0.713   0.842   0.722      0.94   0.94   0.91     0.99    1.17
100   0.5   -0.043  -0.078  -0.119     0.713   0.955   0.745      0.94   0.95   0.89     0.96    1.28
500   0.1   -0.008  -0.008  -0.017     0.317   0.333   0.317      0.95   0.95   0.95     1.00    1.05
500   0.3   -0.008  -0.004  -0.015     0.317   0.373   0.321      0.95   0.95   0.94     0.99    1.16
500   0.5   -0.008  -0.008  -0.028     0.317   0.423   0.332      0.95   0.95   0.92     0.96    1.28

Table B.9: Empirical variance and asymptotic variance of standard deviation of y ($\hat{\sigma}_y$) when ρ = 0.2 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n     p     Empirical Variance         Asymptotic Variance
            CLS     Naive   PM         CLS     Naive   PM
50    0.1   0.257   0.284   0.265      0.261   0.284   0.249
50    0.3   0.257   0.378   0.300      0.261   0.376   0.257
50    0.5   0.257   0.454   0.331      0.261   0.462   0.268
100   0.1   0.126   0.136   0.128      0.127   0.141   0.125
100   0.3   0.126   0.177   0.148      0.127   0.177   0.128
100   0.5   0.126   0.222   0.168      0.127   0.228   0.137
500   0.1   0.025   0.027   0.026      0.025   0.028   0.025
500   0.3   0.025   0.033   0.028      0.025   0.035   0.026
500   0.5   0.025   0.042   0.034      0.025   0.045   0.027

Table B.10: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of y ($\hat{\sigma}_y$) for ρ = 0.2 over sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5

n     p     Bias                       RMSE                       CP                     RRMSE
            CLS     Naive   PM         CLS     Naive   PM         CLS    Naive  PM       CLS/PM  Naive/PM
50    0.1   0.000   0.002   -0.048     0.511   0.533   0.501      0.94   0.95   0.93     1.02    1.06
50    0.3   0.000   0.002   -0.060     0.511   0.614   0.511      0.94   0.94   0.91     1.00    1.20
50    0.5   0.000   -0.019  -0.092     0.511   0.680   0.526      0.94   0.93   0.88     0.97    1.29
100   0.1   -0.005  -0.012  -0.031     0.357   0.376   0.354      0.94   0.94   0.94     1.01    1.06
100   0.3   -0.005  -0.017  -0.036     0.357   0.421   0.360      0.94   0.93   0.93     0.99    1.17
100   0.5   -0.005  -0.024  -0.041     0.357   0.478   0.372      0.94   0.95   0.91     0.96    1.29
500   0.1   -0.008  -0.002  -0.010     0.158   0.167   0.158      0.95   0.96   0.95     1.00    1.05
500   0.3   -0.008  -0.007  -0.019     0.158   0.186   0.161      0.95   0.96   0.93     0.98    1.15
500   0.5   -0.008  -0.007  -0.013     0.158   0.211   0.166      0.95   0.96   0.93     0.95    1.27

Table B.11: Empirical variance and asymptotic variance of correlation ($\hat{\rho}$) when ρ = 0.2 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n     p     Empirical Variance         Asymptotic Variance
            CLS     Naive   PM         CLS     Naive   PM
50    0.1   0.020   0.020   0.021      0.019   0.020   0.018
50    0.3   0.020   0.027   0.027      0.019   0.026   0.018
50    0.5   0.020   0.033   0.033      0.019   0.032   0.020
100   0.1   0.010   0.010   0.010      0.009   0.010   0.009
100   0.3   0.010   0.013   0.013      0.009   0.013   0.009
100   0.5   0.010   0.016   0.016      0.009   0.017   0.010
500   0.1   0.002   0.002   0.002      0.002   0.002   0.002
500   0.3   0.002   0.003   0.003      0.002   0.003   0.002
500   0.5   0.002   0.004   0.004      0.002   0.003   0.002

Table B.12: Bias, RMSE, 95% CP, and relative RMSE of correlation ($\hat{\rho}$) for ρ = 0.2 over sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5

n     p     Bias                       RMSE                       CP                     RRMSE
            CLS     Naive   PM         CLS     Naive   PM         CLS    Naive  PM       CLS/PM  Naive/PM
50    0.1   -0.005  -0.003  -0.003     0.137   0.142   0.133      0.93   0.94   0.92     1.02    1.07
50    0.3   -0.005  -0.002  -0.001     0.137   0.163   0.136      0.93   0.93   0.87     1.01    1.20
50    0.5   -0.005  -0.002  -0.002     0.137   0.180   0.140      0.93   0.93   0.84     0.98    1.29
100   0.1   -0.004  -0.008  -0.007     0.097   0.102   0.096      0.93   0.94   0.92     1.01    1.06
100   0.3   -0.004  -0.006  -0.005     0.097   0.114   0.097      0.93   0.94   0.89     0.99    1.17
100   0.5   -0.004  -0.011  -0.009     0.097   0.129   0.101      0.93   0.95   0.87     0.95    1.28
500   0.1   0.000   0.001   0.001      0.042   0.045   0.042      0.95   0.95   0.94     1.00    1.05
500   0.3   0.000   0.001   0.001      0.042   0.050   0.044      0.95   0.94   0.91     0.97    1.15
500   0.5   0.000   0.000   0.000      0.042   0.057   0.046      0.95   0.94   0.86     0.93    1.25

Table B.13: Empirical variance and asymptotic variance of intercept ($\hat{\beta}_0$) when ρ = 0.4 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n     p     Empirical Variance         Asymptotic Variance
            CLS     Naive   PM         CLS     Naive   PM
50    0.1   0.400   0.452   0.425      0.476   0.519   0.478
50    0.3   0.400   0.610   0.530      0.476   0.687   0.468
50    0.5   0.400   0.742   0.569      0.476   0.841   0.450
100   0.1   0.189   0.211   0.204      0.234   0.259   0.234
100   0.3   0.189   0.269   0.236      0.234   0.324   0.230
100   0.5   0.189   0.389   0.313      0.234   0.418   0.222
500   0.1   0.045   0.049   0.048      0.046   0.051   0.046
500   0.3   0.045   0.065   0.057      0.046   0.063   0.045
500   0.5   0.045   0.075   0.059      0.046   0.082   0.044

Table B.14: Bias, RMSE, 95% CP, and relative RMSE of intercept ($\hat{\beta}_0$) when ρ = 0.4 over sample sizes 50, 100, and 500
and missing proportions of 0.1, 0.3, and 0.5 n p Bias CLS 50 100 500 RMSE CP RRMSE Naive PM CLS Naive PM CLS Naive PM CLS/PM Naive/PM 0.1 -0.003 -0.023 -0.013 0.690 0.721 0.691 0.96 0.96 0.96 1.00 1.04 0.3 -0.003 -0.020 -0.019 0.690 0.829 0.684 0.96 0.95 0.93 1.01 1.21 0.5 -0.003 -0.024 -0.030 0.690 0.918 0.671 0.96 0.95 0.91 1.03 1.37 0.1 0.010 0.018 0.010 0.483 0.509 0.484 0.97 0.96 0.96 1.00 1.05 0.3 0.010 0.017 0.009 0.483 0.569 0.480 0.97 0.97 0.94 1.01 1.19 0.5 0.010 0.029 0.020 0.483 0.647 0.472 0.97 0.95 0.89 1.02 1.37 0.1 -0.003 0.001 -0.001 0.214 0.226 0.214 0.95 0.95 0.95 1.00 1.05 0.3 -0.003 0.001 0.002 0.214 0.252 0.212 0.95 0.95 0.91 1.01 1.19 0.5 -0.003 -0.009 -0.004 0.214 0.286 0.209 0.95 0.95 0.91 1.02 1.37 104 Table B.15: Empirical variance and asymptotic variance of slope (βˆ1 ) when ρ = 0.4 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500 in the presence of missing data n p 50 100 500 Empirical Variance Asymptotic Variance CLS Naive PM CLS Naive PM 0.1 0.005 0.005 0.005 0.005 0.005 0.005 0.3 0.005 0.007 0.007 0.005 0.007 0.005 0.5 0.005 0.008 0.008 0.005 0.008 0.005 0.1 0.002 0.002 0.002 0.002 0.002 0.002 0.3 0.002 0.003 0.003 0.002 0.003 0.003 0.5 0.002 0.004 0.004 0.002 0.004 0.003 0.1 0.000 0.001 0.001 0.000 0.001 0.000 0.3 0.000 0.001 0.001 0.000 0.001 0.001 0.5 0.000 0.001 0.001 0.000 0.001 0.001 Table B.16: Bias, RMSE, 95% CP, and relative RMSE of slope (βˆ1 ) when ρ = 0.4 over sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5 n 50 100 500 p Bias RMSE CP RRMSE CLS Naive PM CLS Naive PM CLS Naive PM CLS/PM Naive/PM 0.1 0.001 0.002 0.002 0.068 0.071 0.069 0.93 0.94 0.94 0.99 1.03 0.3 0.001 0.004 0.004 0.068 0.082 0.072 0.93 0.94 0.91 0.94 1.13 0.5 0.001 0.003 0.003 0.068 0.092 0.074 0.93 0.94 0.89 0.92 1.25 0.1 -0.001 -0.002 -0.002 0.047 0.049 0.048 0.94 0.96 0.95 0.98 1.02 0.3 -0.001 -0.001 -0.000 0.047 0.056 0.050 0.94 0.95 0.92 0.94 1.11 0.5 -0.001 -0.003 -0.002 0.047 0.063 
0.051 0.94 0.95 0.89 0.92 1.24 0.1 0.000 0.001 0.000 0.020 0.022 0.020 0.96 0.96 0.95 1.00 1.12 0.3 0.000 0.000 0.000 0.020 0.025 0.022 0.96 0.94 0.92 0.89 1.09 0.5 0.000 0.000 0.000 0.020 0.028 0.022 0.96 0.93 0.90 0.89 1.26 105 Table B.17: Empirical variance and asymptotic variance of mean of x (µˆx ) when ρ = 0.4 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500 n p 50 100 500 Empirical Variance Asymptotic Variance CLS Naive PM CLS Naive PM 0.1 1.955 2.205 2.058 1.984 2.151 1.939 0.3 1.955 2.752 2.271 1.984 2.805 1.932 0.5 1.955 3.600 2.549 1.984 3.426 1.934 0.1 0.948 1.054 0.952 0.997 1.102 0.987 0.3 0.948 1.311 1.102 0.997 1.381 0.984 0.5 0.948 1.728 1.211 0.997 1.760 0.985 0.1 0.207 0.227 0.212 0.200 0.222 0.200 0.3 0.207 0.280 0.242 0.200 0.277 0.200 0.5 0.207 0.365 0.270 0.200 0.356 0.200 Table B.18: Bias, RMSE, 95% CP, and relative RMSE of mean of x (µˆx ) when ρ = 0.4 over sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5 n 50 100 500 p Bias RMSE CP RRMSE CLS Naive PM CLS Naive PM CLS Naive PM CLS/PM Naive/PM 0.1 0.099 0.118 0.117 1.412 1.472 1.397 0.94 0.93 0.93 1.01 1.05 0.3 0.099 0.133 0.119 1.412 1.680 1.395 0.94 0.95 0.91 1.01 1.20 0.5 0.099 0.095 0.144 1.412 1.853 1.398 0.94 0.94 0.90 1.01 1.33 0.1 0.065 0.053 0.068 1.000 1.051 0.996 0.96 0.96 0.95 1.00 1.06 0.3 0.065 0.066 0.050 1.000 1.177 0.993 0.96 0.96 0.93 1.01 1.19 0.5 0.065 0.046 0.045 1.000 1.328 0.994 0.96 0.94 0.92 1.01 1.34 0.1 0.027 0.011 0.020 0.448 0.471 0.447 0.94 0.94 0.93 1.00 1.05 0.3 0.027 0.002 0.012 0.448 0.526 0.447 0.94 0.94 0.92 1.00 1.18 0.5 0.027 0.009 0.004 0.448 0.596 0.447 0.94 0.94 0.91 1.00 1.33 106 Table B.19: Empirical variance and asymptotic variance of mean of y (µˆy ) when ρ = 0.4 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500 n p 50 100 500 Empirical Variance Asymptotic Variance CLS Naive PM CLS Naive PM 0.1 0.448 0.499 0.470 0.504 0.548 0.495 0.3 0.448 
                 0.668  0.541    0.504  0.719  0.494
     0.5  0.448  0.823  0.574    0.504  0.874  0.489
100  0.1  0.214  0.233  0.225    0.251  0.277  0.248
     0.3  0.214  0.294  0.245    0.251  0.346  0.248
     0.5  0.214  0.433  0.308    0.251  0.443  0.248
500  0.1  0.051  0.055  0.053    0.050  0.055  0.050
     0.3  0.051  0.073  0.060    0.050  0.069  0.050
     0.5  0.051  0.090  0.064    0.050  0.089  0.050

Table B.20: Bias, RMSE, 95% CP, and relative RMSE of mean of y (μ̂y) when ρ = 0.4 over sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5

          Bias                      RMSE                   CP                  RRMSE
n    p    CLS     Naive   PM        CLS    Naive  PM       CLS   Naive  PM     CLS/PM  Naive/PM
50   0.1  0.012   0.002   0.014     0.710  0.741  0.704    0.95  0.96   0.95   1.01    1.05
     0.3  0.012   0.013   0.014     0.710  0.848  0.703    0.95  0.95   0.93   1.01    1.21
     0.5  0.012   0.007   0.005     0.710  0.935  0.699    0.95  0.96   0.93   1.02    1.34
100  0.1  0.019   0.020   0.017     0.501  0.526  0.498    0.97  0.97   0.97   1.01    1.06
     0.3  0.019   0.023   0.017     0.501  0.589  0.498    0.97  0.96   0.95   1.01    1.18
     0.5  0.019   0.027   0.021     0.501  0.666  0.498    0.97  0.95   0.92   1.01    1.34
500  0.1  0.003   0.005   0.004     0.223  0.235  0.223    0.93  0.95   0.94   1.00    1.05
     0.3  0.003   0.002   0.004     0.223  0.263  0.223    0.93  0.95   0.92   1.00    1.18
     0.5  0.003   -0.006  -0.003    0.223  0.298  0.223    0.93  0.95   0.92   1.00    1.33

Table B.21: Empirical variance and asymptotic variance of standard deviation of x (σ̂x) when ρ = 0.4 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
n    p    CLS    Naive  PM       CLS    Naive  PM
50   0.1  1.096  1.155  1.115    1.025  1.115  0.972
     0.3  1.096  1.570  1.217    1.025  1.471  1.004
     0.5  1.096  1.968  1.402    1.025  1.817  1.057
100  0.1  0.546  0.593  0.554    0.507  0.561  0.495
     0.3  0.546  0.804  0.625    0.507  0.706  0.508
     0.5  0.546  0.868  0.715    0.507  0.906  0.543
500  0.1  0.103  0.114  0.107    0.100  0.111  0.100
     0.3  0.103  0.146  0.122    0.100  0.139  0.103
     0.5  0.103  0.183  0.139    0.100  0.179  0.110

Table B.22: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of x (σ̂x) when ρ = 0.4 over sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5

          Bias                      RMSE                   CP                  RRMSE
n    p    CLS     Naive   PM        CLS    Naive  PM       CLS   Naive  PM     CLS/PM  Naive/PM
50   0.1  -0.096  -0.100  -0.212    1.017  1.061  1.008    0.94  0.94   0.92   1.01    1.05
     0.3  -0.096  -0.131  -0.236    1.017  1.220  1.030    0.94  0.93   0.89   0.99    1.18
     0.5  -0.096  -0.146  -0.247    1.017  1.356  1.058    0.94  0.93   0.89   0.96    1.28
100  0.1  -0.044  -0.059  -0.096    0.713  0.751  0.710    0.93  0.93   0.92   1.00    1.06
     0.3  -0.044  -0.053  -0.115    0.713  0.842  0.722    0.93  0.94   0.91   0.99    1.17
     0.5  -0.044  -0.084  -0.119    0.713  0.956  0.747    0.93  0.94   0.90   0.96    1.28
500  0.1  -0.008  -0.008  -0.017    0.317  0.334  0.317    0.95  0.95   0.94   1.00    1.05
     0.3  -0.008  -0.003  -0.015    0.317  0.373  0.322    0.95  0.95   0.93   0.99    1.16
     0.5  -0.008  -0.008  -0.027    0.317  0.423  0.333    0.95  0.95   0.91   0.95    1.27

Table B.23: Empirical variance and asymptotic variance of standard deviation of y (σ̂y) when ρ = 0.4 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
n    p    CLS    Naive  PM       CLS    Naive  PM
50   0.1  0.264  0.291  0.272    0.261  0.284  0.248
     0.3  0.264  0.383  0.310    0.261  0.377  0.257
     0.5  0.264  0.462  0.338    0.261  0.464  0.269
100  0.1  0.127  0.135  0.129    0.127  0.141  0.124
     0.3  0.127  0.179  0.150    0.127  0.177  0.128
     0.5  0.127  0.220  0.165    0.127  0.228  0.137
500  0.1  0.024  0.027  0.026    0.025  0.028  0.025
     0.3  0.024  0.032  0.028    0.025  0.035  0.026
     0.5  0.024  0.042  0.034    0.025  0.045  0.028

Table B.24: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of y (σ̂y) when ρ = 0.4 over sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5

          Bias                      RMSE                   CP                  RRMSE
n    p    CLS     Naive   PM        CLS    Naive  PM       CLS   Naive  PM     CLS/PM  Naive/PM
50   0.1  -0.004  -0.001  -0.052    0.511  0.533  0.501    0.93  0.95   0.91   1.02    1.06
     0.3  -0.004  -0.001  -0.063    0.511  0.614  0.511    0.93  0.93   0.90   1.00    1.20
     0.5  -0.004  -0.020  -0.094    0.511  0.681  0.527    0.93  0.93   0.89   0.97    1.29
100  0.1  -0.008  -0.017  -0.035    0.357  0.376  0.354    0.94  0.94   0.93   1.01    1.06
     0.3  -0.008  -0.020  -0.038    0.357  0.421  0.360    0.94  0.94   0.92   0.99    1.17
     0.5  -0.008  -0.031  -0.042    0.357  0.478  0.373    0.94  0.94   0.92   0.96    1.28
500  0.1  -0.007  -0.001  -0.009    0.158  0.167  0.158    0.96  0.96   0.95   1.00    1.05
     0.3  -0.007  -0.006  -0.017    0.158  0.186  0.161    0.96  0.96   0.94   0.98    1.16
     0.5  -0.007  -0.006  -0.013    0.158  0.211  0.167    0.96  0.97   0.93   0.95    1.27

Table B.25: Empirical variance and asymptotic variance of correlation (ρ̂) when ρ = 0.4 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
n    p    CLS    Naive  PM       CLS    Naive  PM
50   0.1  0.015  0.016  0.016    0.015  0.016  0.014
     0.3  0.015  0.021  0.021    0.015  0.021  0.014
     0.5  0.015  0.026  0.025    0.015  0.026  0.015
100  0.1  0.007  0.008  0.008    0.007  0.008  0.007
     0.3  0.007  0.010  0.010    0.007  0.010  0.007
     0.5  0.007  0.012  0.012    0.007  0.013  0.008
500  0.1  0.002  0.002  0.002    0.001  0.002  0.001
     0.3  0.002  0.002  0.002    0.001  0.002  0.002
     0.5  0.002  0.003  0.003    0.001  0.003  0.002

Table B.26: Bias, RMSE, 95% CP, and relative RMSE of correlation (ρ̂) when ρ = 0.4 over sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5

          Bias                      RMSE                   CP                  RRMSE
n    p    CLS     Naive   PM        CLS    Naive  PM       CLS   Naive  PM     CLS/PM  Naive/PM
50   0.1  -0.007  -0.005  -0.005    0.121  0.126  0.118    0.93  0.93   0.92   1.03    1.07
     0.3  -0.007  -0.005  -0.003    0.121  0.145  0.120    0.93  0.93   0.87   1.01    1.21
     0.5  -0.007  -0.005  -0.003    0.121  0.160  0.124    0.93  0.93   0.86   0.98    1.29
100  0.1  -0.005  -0.008  -0.006    0.085  0.090  0.085    0.93  0.95   0.93   1.01    1.06
     0.3  -0.005  -0.006  -0.005    0.085  0.100  0.086    0.93  0.94   0.90   0.99    1.17
     0.5  -0.005  -0.011  -0.007    0.085  0.115  0.090    0.93  0.94   0.88   0.95    1.28
500  0.1  0.000   0.000   0.000     0.037  0.040  0.037    0.95  0.95   0.94   1.00    1.07
     0.3  0.000   0.001   0.001     0.037  0.045  0.039    0.95  0.94   0.90   0.97    1.16
     0.5  0.000   -0.000  -0.000    0.037  0.050  0.040    0.95  0.94   0.86   0.94    1.25

Table B.27: Empirical variance and asymptotic variance of intercept (β̂0) when ρ = 0.6 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
n    p    CLS    Naive  PM       CLS    Naive  PM
50   0.1  0.304  0.346  0.330    0.363  0.395  0.363
     0.3  0.304  0.466  0.420    0.363  0.522  0.356
     0.5  0.304  0.568  0.461    0.363  0.640  0.343
100  0.1  0.145  0.162  0.157    0.178  0.197  0.178
     0.3  0.145  0.207  0.186    0.178  0.247  0.175
     0.5  0.145  0.298  0.251    0.178  0.319  0.170
500  0.1  0.034  0.037  0.036    0.035  0.039  0.035
     0.3  0.034  0.049  0.044    0.035  0.048  0.034
     0.5  0.034  0.057  0.047    0.035  0.062  0.033

Table B.28: Bias, RMSE, 95% CP, and relative RMSE of intercept (β̂0) when ρ = 0.6 over sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5

          Bias                      RMSE                   CP                  RRMSE
n    p    CLS     Naive   PM        CLS    Naive  PM       CLS   Naive  PM     CLS/PM  Naive/PM
50   0.1  -0.008  -0.026  -0.019    0.602  0.629  0.603    0.97  0.96   0.95   1.00    1.04
     0.3  -0.008  -0.023  -0.024    0.602  0.723  0.597    0.97  0.94   0.92   1.01    1.21
     0.5  -0.008  -0.026  -0.034    0.602  0.800  0.587    0.97  0.95   0.90   1.03    1.36
100  0.1  0.006   0.012   0.006     0.422  0.445  0.422    0.97  0.96   0.96   1.00    1.05
     0.3  0.006   0.012   0.005     0.422  0.497  0.419    0.97  0.96   0.93   1.01    1.19
     0.5  0.006   0.023   0.013     0.422  0.565  0.412    0.97  0.96   0.88   1.02    1.37
500  0.1  -0.003  0.000   -0.000    0.187  0.197  0.187    0.94  0.96   0.95   1.00    1.05
     0.3  -0.003  0.001   0.003     0.187  0.220  0.186    0.94  0.95   0.91   1.01    1.18
     0.5  -0.003  -0.008  -0.003    0.187  0.250  0.183    0.94  0.95   0.90   1.02    1.36

Table B.29: Empirical variance and asymptotic variance of slope (β̂1) when ρ = 0.6 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
n    p    CLS    Naive  PM       CLS    Naive  PM
50   0.1  0.004  0.004  0.004    0.004  0.004  0.004
     0.3  0.004  0.005  0.005    0.004  0.005  0.004
     0.5  0.004  0.006  0.006    0.004  0.006  0.004
100  0.1  0.002  0.002  0.002    0.002  0.002  0.002
     0.3  0.002  0.003  0.002    0.002  0.002  0.002
     0.5  0.002  0.003  0.003    0.002  0.003  0.002
500  0.1  0.000  0.000  0.000    0.000  0.000  0.000
     0.3  0.000  0.001  0.000    0.000  0.000  0.000
     0.5  0.000  0.001  0.001    0.000  0.001  0.000

Table B.30: Bias, RMSE, 95% CP, and relative RMSE of slope (β̂1) when ρ = 0.6 over sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5

          Bias                      RMSE                   CP                  RRMSE
n    p    CLS     Naive   PM        CLS    Naive  PM       CLS   Naive  PM     CLS/PM  Naive/PM
50   0.1  0.001   0.002   0.003     0.059  0.062  0.060    0.94  0.94   0.93   0.99    1.03
     0.3  0.001   0.004   0.005     0.059  0.072  0.062    0.94  0.94   0.91   0.96    1.16
     0.5  0.001   0.004   0.004     0.059  0.080  0.063    0.94  0.94   0.88   0.93    1.25
100  0.1  -0.000  -0.002  -0.001    0.041  0.044  0.041    0.93  0.95   0.94   1.00    1.06
     0.3  -0.000  -0.001  0.001     0.041  0.048  0.042    0.93  0.94   0.91   0.97    1.13
     0.5  -0.000  -0.002  -0.000    0.041  0.055  0.044    0.93  0.95   0.91   0.95    1.26
500  0.1  0.000   0.000   0.000     0.017  0.020  0.017    0.96  0.96   0.95   1.00    1.16
     0.3  0.000   0.000   -0.000    0.017  0.020  0.020    0.96  0.95   0.92   0.87    1.00
     0.5  0.000   0.000   0.000     0.017  0.025  0.020    0.96  0.94   0.90   0.87    1.23

Table B.31: Empirical variance and asymptotic variance of mean of x (μ̂x) when ρ = 0.6 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
n    p    CLS    Naive  PM       CLS    Naive  PM
50   0.1  1.954  2.196  2.040    1.984  2.153  1.940
     0.3  1.954  2.746  2.218    1.984  2.807  1.934
     0.5  1.954  3.598  2.470    1.984  3.430  1.936
100  0.1  0.937  1.041  0.938    0.996  1.101  0.987
     0.3  0.937  1.298  1.056    0.996  1.380  0.984
     0.5  0.937  1.723  1.149    0.996  1.758  0.987
500  0.1  0.208  0.227  0.211    0.200  0.222  0.200
     0.3  0.208  0.282  0.237    0.200  0.277  0.200
     0.5  0.208  0.370  0.260    0.200  0.356  0.200

Table B.32: Bias, RMSE, 95% CP, and relative RMSE of mean of x (μ̂x) when ρ = 0.6 over sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5

          Bias                      RMSE                   CP                  RRMSE
n    p    CLS     Naive   PM        CLS    Naive  PM       CLS   Naive  PM     CLS/PM  Naive/PM
50   0.1  0.097   0.114   0.116     1.412  1.472  1.398    0.94  0.93   0.93   1.01    1.05
     0.3  0.097   0.130   0.115     1.412  1.681  1.395    0.94  0.96   0.92   1.01    1.20
     0.5  0.097   0.093   0.136     1.412  1.854  1.398    0.94  0.94   0.91   1.01    1.33
100  0.1  0.066   0.055   0.066     1.000  1.051  0.996    0.95  0.95   0.94   1.00    1.06
     0.3  0.066   0.067   0.049     1.000  1.177  0.993    0.95  0.95   0.93   1.01    1.18
     0.5  0.066   0.049   0.042     1.000  1.327  0.994    0.95  0.94   0.93   1.01    1.33
500  0.1  0.026   0.012   0.019     0.448  0.471  0.447    0.94  0.93   0.93   1.00    1.05
     0.3  0.026   0.002   0.013     0.448  0.526  0.447    0.94  0.93   0.93   1.00    1.18
     0.5  0.026   0.008   0.007     0.448  0.596  0.447    0.94  0.94   0.91   1.00    1.33

Table B.33: Empirical variance and asymptotic variance of mean of y (μ̂y) when ρ = 0.6 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
n    p    CLS    Naive  PM       CLS    Naive  PM
50   0.1  0.454  0.503  0.472    0.503  0.547  0.494
     0.3  0.454  0.668  0.529    0.503  0.717  0.493
     0.5  0.454  0.835  0.560    0.503  0.873  0.489
100  0.1  0.212  0.232  0.221    0.250  0.276  0.247
     0.3  0.212  0.293  0.239    0.250  0.345  0.248
     0.5  0.212  0.430  0.289    0.250  0.441  0.248
500  0.1  0.052  0.056  0.053    0.050  0.056  0.050
     0.3  0.052  0.074  0.059    0.050  0.069  0.050
     0.5  0.052  0.093  0.063    0.050  0.089  0.050

Table B.34: Bias, RMSE, 95% CP, and relative RMSE of mean of y (μ̂y) when ρ = 0.6 over sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5

          Bias                      RMSE                   CP                  RRMSE
n    p    CLS     Naive   PM        CLS    Naive  PM       CLS   Naive  PM     CLS/PM  Naive/PM
50   0.1  0.019   0.012   0.021     0.710  0.740  0.703    0.96  0.96   0.95   1.01    1.05
     0.3  0.019   0.023   0.021     0.710  0.847  0.703    0.96  0.95   0.94   1.01    1.21
     0.5  0.019   0.015   0.016     0.710  0.934  0.700    0.96  0.95   0.92   1.01    1.34
100  0.1  0.023   0.023   0.022     0.501  0.526  0.498    0.97  0.97   0.97   1.01    1.06
     0.3  0.023   0.027   0.020     0.501  0.588  0.498    0.97  0.97   0.96   1.01    1.18
     0.5  0.023   0.029   0.023     0.501  0.665  0.498    0.97  0.95   0.94   1.00    1.33
500  0.1  0.005   0.005   0.006     0.223  0.236  0.224    0.93  0.95   0.94   1.00    1.05
     0.3  0.005   0.002   0.006     0.223  0.263  0.223    0.93  0.95   0.92   1.00    1.18
     0.5  0.005   -0.005  -0.001    0.223  0.298  0.223    0.93  0.95   0.91   1.00    1.33

Table B.35: Empirical variance and asymptotic variance of standard deviation of x (σ̂x) when ρ = 0.6 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
n    p    CLS    Naive  PM       CLS    Naive  PM
50   0.1  1.088  1.146  1.106    1.030  1.121  0.973
     0.3  1.088  1.552  1.197    1.030  1.482  1.008
     0.5  1.088  1.958  1.376    1.030  1.834  1.066
100  0.1  0.540  0.587  0.550    0.507  0.562  0.495
     0.3  0.540  0.795  0.615    0.507  0.708  0.510
     0.5
          0.540  0.860  0.700    0.507  0.909  0.549
500  0.1  0.104  0.116  0.108    0.100  0.111  0.100
     0.3  0.104  0.147  0.122    0.100  0.139  0.104
     0.5  0.104  0.184  0.138    0.100  0.179  0.111

Table B.36: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of x (σ̂x) when ρ = 0.6 over sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5

          Bias                      RMSE                   CP                  RRMSE
n    p    CLS     Naive   PM        CLS    Naive  PM       CLS   Naive  PM     CLS/PM  Naive/PM
50   0.1  -0.095  -0.097  -0.208    1.019  1.063  1.008    0.94  0.93   0.91   1.01    1.05
     0.3  -0.095  -0.126  -0.231    1.019  1.224  1.030    0.94  0.92   0.90   0.99    1.19
     0.5  -0.095  -0.141  -0.244    1.019  1.362  1.061    0.94  0.93   0.89   0.96    1.28
100  0.1  -0.045  -0.063  -0.095    0.714  0.752  0.710    0.93  0.93   0.93   1.01    1.06
     0.3  -0.045  -0.055  -0.111    0.714  0.843  0.723    0.93  0.93   0.91   0.99    1.17
     0.5  -0.045  -0.089  -0.113    0.714  0.957  0.749    0.93  0.94   0.91   0.95    1.28
500  0.1  -0.008  -0.007  -0.018    0.317  0.334  0.317    0.94  0.95   0.94   1.00    1.05
     0.3  -0.008  -0.003  -0.016    0.317  0.373  0.322    0.94  0.95   0.93   0.98    1.16
     0.5  -0.008  -0.008  -0.026    0.317  0.423  0.334    0.94  0.94   0.92   0.95    1.27

Table B.37: Empirical variance and asymptotic variance of standard deviation of y (σ̂y) when ρ = 0.6 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
n    p    CLS    Naive  PM       CLS    Naive  PM
50   0.1  0.271  0.296  0.278    0.261  0.285  0.248
     0.3  0.271  0.385  0.314    0.261  0.379  0.258
     0.5  0.271  0.467  0.338    0.261  0.467  0.271
100  0.1  0.127  0.134  0.129    0.127  0.141  0.124
     0.3  0.127  0.181  0.151    0.127  0.177  0.129
     0.5  0.127  0.218  0.160    0.127  0.228  0.138
500  0.1  0.024  0.027  0.025    0.025  0.028  0.025
     0.3  0.024  0.033  0.028    0.025  0.035  0.026
     0.5  0.024  0.042  0.033    0.025  0.045  0.028

Table B.38: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of y (σ̂y) when ρ = 0.6 over sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5

          Bias                      RMSE                   CP                  RRMSE
n    p    CLS     Naive   PM        CLS    Naive  PM       CLS   Naive  PM     CLS/PM  Naive/PM
50   0.1  -0.011  -0.007  -0.059    0.511  0.534  0.501    0.93  0.94   0.93   1.02    1.07
     0.3  -0.011  -0.007  -0.067    0.511  0.615  0.512    0.93  0.94   0.91   1.00    1.20
     0.5  -0.011  -0.024  -0.095    0.511  0.684  0.529    0.93  0.93   0.89   0.97    1.29
100  0.1  -0.011  -0.023  -0.039    0.357  0.376  0.355    0.94  0.94   0.94   1.01    1.06
     0.3  -0.011  -0.024  -0.040    0.357  0.422  0.361    0.94  0.93   0.92   0.99    1.17
     0.5  -0.011  -0.038  -0.044    0.357  0.479  0.375    0.94  0.95   0.93   0.95    1.28
500  0.1  -0.007  -0.001  -0.009    0.158  0.167  0.158    0.96  0.96   0.95   1.00    1.05
     0.3  -0.007  -0.004  -0.016    0.158  0.187  0.161    0.96  0.96   0.94   0.98    1.16
     0.5  -0.007  -0.005  -0.012    0.158  0.212  0.167    0.96  0.96   0.94   0.95    1.27

Table B.39: Empirical variance and asymptotic variance of correlation (ρ̂) when ρ = 0.6 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
n    p    CLS    Naive  PM       CLS    Naive  PM
50   0.1  0.009  0.009  0.009    0.009  0.010  0.008
     0.3  0.009  0.012  0.012    0.009  0.013  0.009
     0.5  0.009  0.015  0.014    0.009  0.016  0.009
100  0.1  0.004  0.005  0.005    0.004  0.005  0.004
     0.3  0.004  0.006  0.005    0.004  0.006  0.004
     0.5  0.004  0.007  0.007    0.004  0.008  0.005
500  0.1  0.001  0.001  0.001    0.001  0.001  0.001
     0.3  0.001  0.001  0.001    0.001  0.001  0.001
     0.5  0.001  0.002  0.002    0.001  0.002  0.001

Table B.40: Bias, RMSE, 95% CP, and relative RMSE of correlation (ρ̂) when ρ = 0.6 over sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5

          Bias                      RMSE                   CP                  RRMSE
n    p    CLS     Naive   PM        CLS    Naive  PM       CLS   Naive  PM     CLS/PM  Naive/PM
50   0.1  -0.007  -0.006  -0.006    0.094  0.098  0.092    0.92  0.94   0.91   1.03    1.07
     0.3  -0.007  -0.006  -0.003    0.094  0.113  0.093    0.92  0.94   0.87   1.01    1.21
     0.5  -0.007  -0.007  -0.002    0.094  0.126  0.097    0.92  0.93   0.88   0.98    1.30
100  0.1  -0.004  -0.007  -0.005    0.066  0.070  0.065    0.94  0.95   0.94   1.01    1.07
     0.3  -0.004  -0.006  -0.004    0.066  0.078  0.066    0.94  0.95   0.90   1.00    1.18
     0.5  -0.004  -0.009  -0.005    0.066  0.089  0.069    0.94  0.94   0.88   0.95    1.29
500  0.1  -0.000  0.000   0.000     0.028  0.030  0.028    0.95  0.95   0.93   1.00    1.06
     0.3  -0.000  0.000   0.000     0.028  0.033  0.030    0.95  0.94   0.91   0.94    1.11
     0.5  -0.000  -0.000  -0.000    0.028  0.039  0.030    0.95  0.93   0.87   0.94    1.29

Table B.41: Empirical variance and asymptotic variance of intercept (β̂0) when ρ = 0.8 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
n    p    CLS    Naive  PM       CLS    Naive  PM
50   0.1  0.170  0.196  0.191    0.204  0.222  0.204
     0.3  0.170  0.263  0.249    0.204  0.293  0.200
     0.5  0.170  0.323  0.286    0.204  0.359  0.195
100  0.1  0.083  0.092  0.091    0.100  0.111  0.100
     0.3  0.083  0.118  0.110    0.100  0.139  0.099
     0.5  0.083  0.168  0.152    0.100  0.180  0.097
500  0.1  0.019  0.021  0.020    0.020  0.022  0.020
     0.3  0.019  0.027  0.026    0.020  0.027  0.019
     0.5  0.019  0.031  0.028    0.020  0.035  0.019

Table B.42: Bias, RMSE, 95% CP, and relative RMSE of intercept (β̂0) when ρ = 0.8 over sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5

          Bias                      RMSE                   CP                  RRMSE
n    p    CLS     Naive   PM        CLS    Naive  PM       CLS   Naive  PM     CLS/PM  Naive/PM
50   0.1  -0.010  -0.023  -0.021    0.452  0.472  0.452    0.96  0.96   0.95   1.00    1.04
     0.3  -0.010  -0.022  -0.024    0.452  0.542  0.448    0.96  0.95   0.92   1.01    1.21
     0.5  -0.010  -0.023  -0.032    0.452  0.599  0.442    0.96  0.95   0.88   1.02    1.35
100  0.1  0.001   0.007   0.003     0.316  0.333  0.316    0.97  0.96   0.96   1.00    1.05
     0.3  0.001   0.007   0.002     0.316  0.373  0.314    0.97  0.96   0.93   1.01    1.19
     0.5  0.001   0.015   0.007     0.316  0.424  0.311    0.97  0.96   0.88   1.02    1.36
500  0.1  -0.003  0.000   -0.000    0.140  0.148  0.140    0.95  0.95   0.95   1.00    1.05
     0.3  -0.003  0.001   0.002     0.140  0.165  0.139    0.95  0.95   0.92   1.01    1.18
     0.5  -0.003  -0.006  -0.003    0.140  0.187  0.138    0.95  0.96   0.89   1.02    1.35

Table B.43: Empirical variance and asymptotic variance of slope (β̂1) when ρ = 0.8 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
n    p    CLS    Naive  PM       CLS    Naive  PM
50   0.1  0.002  0.002  0.002    0.002  0.002  0.002
     0.3  0.002  0.003  0.003    0.002  0.003  0.002
     0.5  0.002  0.004  0.003    0.002  0.004  0.002
100  0.1  0.001  0.001  0.001    0.001  0.001  0.001
     0.3  0.001  0.001  0.001    0.001  0.001  0.001
     0.5  0.001  0.002  0.002    0.001  0.002
                                               0.001
500  0.1  0.000  0.000  0.000    0.000  0.000  0.000
     0.3  0.000  0.000  0.000    0.000  0.000  0.000
     0.5  0.000  0.000  0.000    0.000  0.000  0.000

Table B.44: Bias, RMSE, 95% CP, and relative RMSE of slope (β̂1) when ρ = 0.8 over sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5

          Bias                      RMSE                   CP                  RRMSE
n    p    CLS     Naive   PM        CLS    Naive  PM       CLS   Naive  PM     CLS/PM  Naive/PM
50   0.1  0.001   0.002   0.002     0.044  0.046  0.045    0.94  0.94   0.94   0.97    1.02
     0.3  0.001   0.004   0.004     0.044  0.053  0.046    0.94  0.93   0.89   0.95    1.15
     0.5  0.001   0.003   0.004     0.044  0.059  0.046    0.94  0.93   0.87   0.95    1.29
100  0.1  0.000   -0.001  -0.000    0.030  0.032  0.032    0.93  0.94   0.93   0.95    1.00
     0.3  0.000   -0.000  0.001     0.030  0.036  0.032    0.93  0.94   0.92   0.95    1.14
     0.5  0.000   -0.001  0.001     0.030  0.041  0.032    0.93  0.95   0.89   0.95    1.31
500  0.1  -0.000  0.000   0.000     0.014  0.014  0.014    0.96  0.96   0.95   1.00    1.00
     0.3  -0.000  0.000   -0.000    0.014  0.017  0.014    0.96  0.95   0.92   1.00    1.23
     0.5  -0.000  0.000   0.000     0.014  0.017  0.014    0.96  0.94   0.90   1.00    1.23

Table B.45: Empirical variance and asymptotic variance of mean of x (μ̂x) when ρ = 0.8 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
n    p    CLS    Naive  PM       CLS    Naive  PM
50   0.1  1.950  2.183  2.009    1.984  2.151  1.941
     0.3  1.950  2.738  2.134    1.984  2.807  1.933
     0.5  1.950  3.591  2.331    1.984  3.419  1.936
100  0.1  0.924  1.025  0.922    0.996  1.100  0.987
     0.3  0.924  1.281  0.998    0.996  1.380  0.986
     0.5  0.924  1.717  1.069    0.996  1.756  0.990
500  0.1  0.209  0.228  0.211    0.200  0.222  0.200
     0.3  0.209  0.285  0.229    0.200  0.277  0.200
     0.5  0.209  0.375  0.246    0.200  0.356  0.200

Table B.46: Bias, RMSE, 95% CP, and relative RMSE of mean of x (μ̂x) when ρ = 0.8 over sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5

          Bias                      RMSE                   CP                  RRMSE
n    p    CLS     Naive   PM        CLS    Naive  PM       CLS   Naive  PM     CLS/PM  Naive/PM
50   0.1  0.095   0.109   0.111     1.412  1.471  1.398    0.95  0.94   0.93   1.01    1.05
     0.3  0.095   0.126   0.108     1.412  1.680  1.395    0.95  0.96   0.92   1.01    1.20
     0.5  0.095   0.090   0.122     1.412  1.851  1.397    0.95  0.94   0.92   1.01    1.33
100  0.1  0.066   0.056   0.064     1.000  1.050  0.996    0.96  0.95   0.94   1.00    1.06
     0.3  0.066   0.068   0.049     1.000  1.177  0.994    0.96  0.95   0.94   1.01    1.18
     0.5  0.066   0.052   0.041     1.000  1.326  0.996    0.96  0.94   0.94   1.00    1.33
500  0.1  0.026   0.012   0.019     0.448  0.471  0.447    0.93  0.94   0.92   1.00    1.05
     0.3  0.026   0.002   0.016     0.448  0.526  0.447    0.93  0.94   0.93   1.00    1.18
     0.5  0.026   0.006   0.012     0.448  0.596  0.447    0.93  0.94   0.92   1.00    1.33

Table B.47: Empirical variance and asymptotic variance of mean of y (μ̂y) when ρ = 0.8 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
n    p    CLS    Naive  PM       CLS    Naive  PM
50   0.1  0.463  0.511  0.475    0.501  0.545  0.492
     0.3  0.463  0.670  0.512    0.501  0.714  0.492
     0.5  0.463  0.853  0.538    0.501  0.865  0.488
100  0.1  0.213  0.233  0.218    0.250  0.275  0.247
     0.3  0.213  0.295  0.232    0.250  0.345  0.247
     0.5  0.213  0.427  0.266    0.250  0.440  0.248
500  0.1  0.052  0.056  0.053    0.050  0.056  0.050
     0.3  0.052  0.074  0.057    0.050  0.069  0.050
     0.5  0.052  0.095  0.060    0.050  0.089  0.050

Table B.48: Bias, RMSE, 95% CP, and relative RMSE of mean of y (μ̂y) when ρ = 0.8 over sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5

          Bias                      RMSE                   CP                  RRMSE
n    p    CLS     Naive   PM        CLS    Naive  PM       CLS   Naive  PM     CLS/PM  Naive/PM
50   0.1  0.028   0.024   0.029     0.708  0.738  0.702    0.95  0.96   0.95   1.01    1.05
     0.3  0.028   0.035   0.030     0.708  0.846  0.702    0.95  0.95   0.94   1.01    1.21
     0.5  0.028   0.024   0.028     0.708  0.930  0.699    0.95  0.96   0.93   1.01    1.33
100  0.1  0.027   0.026   0.026     0.500  0.525  0.498    0.96  0.97   0.96   1.01    1.06
     0.3  0.027   0.030   0.023     0.500  0.588  0.498    0.96  0.96   0.95   1.01    1.18
     0.5  0.027   0.030   0.024     0.500  0.664  0.499    0.96  0.95   0.95   1.00    1.33
500  0.1  0.007   0.006   0.008     0.224  0.236  0.224    0.94  0.94   0.94   1.00    1.05
     0.3  0.007   0.002   0.008     0.224  0.263  0.223    0.94  0.95   0.92   1.00    1.18
     0.5  0.007   -0.003  0.003     0.224  0.298  0.224    0.94  0.94   0.91   1.00    1.33

Table B.49: Empirical variance and asymptotic variance of standard deviation of x (σ̂x) when ρ = 0.8 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
n    p    CLS    Naive  PM       CLS    Naive  PM
50   0.1  1.082  1.139  1.091    1.039  1.131  0.973
     0.3  1.082  1.531  1.155    1.039  1.501  1.007
     0.5  1.082  1.942  1.296    1.039  1.857  1.062
100  0.1  0.533  0.579  0.542    0.510  0.564  0.495
     0.3  0.533  0.783  0.594    0.510  0.712  0.511
     0.5  0.533  0.850  0.665    0.510  0.915  0.551
500  0.1  0.105  0.117  0.108    0.100  0.111  0.100
     0.3  0.105  0.147  0.119    0.100  0.139  0.104
     0.5  0.105  0.184  0.131    0.100  0.179  0.111

Table B.50: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of x (σ̂x) when ρ = 0.8 over sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5

          Bias                      RMSE                   CP                  RRMSE
n    p    CLS     Naive   PM        CLS    Naive  PM       CLS   Naive  PM     CLS/PM  Naive/PM
50   0.1  -0.092  -0.092  -0.202    1.023  1.068  1.007    0.93  0.93   0.90   1.02    1.06
     0.3  -0.092  -0.117  -0.222    1.023  1.231  1.028    0.93  0.94   0.90   1.00    1.20
     0.5  -0.092  -0.133  -0.235    1.023  1.369  1.057    0.93  0.92   0.89   0.97    1.30
100  0.1  -0.046  -0.066  -0.093    0.715  0.754  0.710    0.94  0.93   0.93   1.01    1.06
     0.3  -0.046  -0.057  -0.103    0.715  0.846  0.722    0.94  0.94   0.92   0.99    1.17
     0.5  -0.046  -0.095  -0.101    0.715  0.961  0.749    0.94  0.95   0.92   0.96    1.28
500  0.1  -0.008  -0.006  -0.019    0.317  0.334  0.317    0.94  0.95   0.94   1.00    1.05
     0.3  -0.008  -0.002  -0.018    0.317  0.373  0.322    0.94  0.95   0.93   0.98    1.16
     0.5  -0.008  -0.007  -0.024    0.317  0.423  0.334    0.94  0.94   0.93   0.95    1.27

Table B.51: Empirical variance and asymptotic variance of standard deviation of y (σ̂y) when ρ = 0.8 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
n    p    CLS    Naive  PM       CLS    Naive  PM
50   0.1  0.274  0.296  0.279    0.262  0.286  0.247
     0.3  0.274  0.382  0.305    0.262  0.382  0.256
     0.5  0.274  0.468  0.320    0.262  0.470  0.269
100  0.1  0.127  0.134  0.127    0.128  0.141  0.124
     0.3  0.127  0.182  0.145    0.128  0.178  0.128
     0.5  0.127  0.213  0.152    0.128  0.229  0.138
500  0.1  0.025  0.028  0.025    0.025  0.028  0.025
     0.3  0.025  0.034  0.028    0.025  0.035  0.026
     0.5  0.025  0.043  0.031    0.025  0.045  0.028

Table B.52: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of y (σ̂y) when ρ = 0.8 over sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5

          Bias                      RMSE                   CP                  RRMSE
n    p    CLS     Naive   PM        CLS    Naive  PM       CLS   Naive  PM     CLS/PM  Naive/PM
50   0.1  -0.021  -0.016  -0.069    0.513  0.535  0.501    0.93  0.93   0.92   1.02    1.07
     0.3  -0.021  -0.018  -0.073    0.513  0.618  0.512    0.93  0.94   0.90   1.00    1.21
     0.5  -0.021  -0.032  -0.093    0.513  0.686  0.527    0.93  0.92   0.90   0.97    1.30
100  0.1  -0.016  -0.029  -0.042    0.358  0.377  0.355    0.94  0.94   0.93   1.01    1.06
     0.3  -0.016  -0.027  -0.043    0.358  0.423  0.361    0.94  0.92   0.92   0.99    1.17
     0.5  -0.016  -0.046  -0.044    0.358  0.481  0.375    0.94  0.95   0.94   0.95    1.28
500  0.1  -0.006  -0.001  -0.008    0.159  0.167  0.158    0.96  0.95   0.95   1.00    1.06
     0.3  -0.006  -0.003  -0.013    0.159  0.187  0.162    0.96  0.95   0.95   0.98    1.16
     0.5  -0.006  -0.005  -0.011    0.159  0.212  0.167    0.96  0.95   0.93   0.95    1.26

Table B.53: Empirical variance and asymptotic variance of correlation (ρ̂) when ρ = 0.8 over different missing proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
n    p    CLS    Naive  PM       CLS    Naive  PM
50   0.1  0.003  0.003  0.003    0.003  0.003  0.003
     0.3  0.003  0.004  0.004    0.003  0.004  0.003
     0.5  0.003  0.005  0.004    0.003  0.006  0.003
100  0.1  0.001  0.002  0.002    0.001  0.002  0.001
     0.3  0.001  0.002  0.002    0.001  0.002  0.001
     0.5  0.001  0.002  0.002    0.001  0.003  0.002
500  0.1  0.000  0.000  0.000    0.000  0.000  0.000
     0.3  0.000  0.000  0.000    0.000  0.000  0.000
     0.5  0.000  0.001  0.000    0.000  0.001  0.000

Table B.54: Bias, RMSE, 95% CP, and relative RMSE of correlation (ρ̂) when ρ = 0.8 over sample sizes 50, 100, and 500 and missing proportions of 0.1, 0.3, and 0.5

          Bias                      RMSE                   CP                  RRMSE
n    p    CLS     Naive   PM        CLS    Naive  PM       CLS   Naive  PM     CLS/PM  Naive/PM
50   0.1  -0.005  -0.004  -0.004    0.055  0.057  0.053    0.94  0.94   0.93   1.04    1.07
     0.3  -0.005  -0.005  -0.002    0.055  0.067  0.054    0.94  0.94   0.90   1.02    1.23
     0.5  -0.005  -0.005  -0.001    0.055  0.075  0.055    0.94  0.95   0.89   1.00    1.37
100  0.1  -0.003  -0.004  -0.003    0.038  0.040  0.038    0.94  0.94   0.93   1.00    1.07
     0.3  -0.003  -0.004  -0.002    0.038  0.045  0.038    0.94  0.94   0.91   1.00    1.20
     0.5  -0.003  -0.006  -0.002    0.038  0.051  0.039    0.94  0.95   0.88   0.97    1.32
500  0.1  -0.000  -0.000  -0.000    0.017  0.017  0.017    0.94  0.94   0.93   1.00    1.00
     0.3  -0.000  0.000   0.000     0.017  0.020  0.017    0.94  0.94   0.90   1.00    1.16
     0.5  -0.000  -0.000  0.000     0.017  0.022  0.017    0.94  0.93   0.88   1.00    1.29

B.2 Results in the Presence of Right Censored Observations in the Data

Figure B.4: Boxplots for estimators of mean of x (μ̂x) and mean of y (μ̂y), when n = 100, ρ = 0.2 over different right censoring proportions of 0.1, 0.3, and 0.5

Figure B.5: Boxplots for estimators of standard deviation of x (σ̂x) and standard deviation of y (σ̂y), when n = 100, ρ = 0.2 over different right censoring proportions of 0.1, 0.3, and 0.5

Figure B.6: Boxplots for estimators of correlation between x and y (ρ̂), when n = 100, ρ = 0.2 over different right censoring proportions of 0.1, 0.3, and 0.5

Table B.55: Empirical variance and asymptotic variance of intercept (β̂0) when ρ = 0.2 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
n    p    CLS    Naive  PM       CLS    Naive  PM
50   0.1  0.460  0.470  0.458    0.545  0.570  0.556
     0.3  0.460  0.536  0.464    0.545  0.606  0.557
     0.5  0.460  0.672  0.505    0.545  0.643  0.572
100  0.1  0.215  0.218  0.217    0.267  0.280  0.268
     0.3  0.215  0.254  0.223    0.267  0.292  0.267
     0.5  0.215  0.313  0.234    0.267  0.303  0.272
500  0.1  0.052  0.051  0.052    0.052  0.054  0.052
     0.3  0.052  0.057  0.053    0.052  0.057  0.052
     0.5  0.052  0.068  0.057    0.052  0.059  0.053

Table B.56: Empirical variance and asymptotic variance of slope (β̂1) when ρ = 0.2 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
n    p    CLS    Naive  PM       CLS    Naive  PM
50   0.1  0.006  0.006  0.006    0.005  0.006  0.005
     0.3  0.006
0.008 0.006 0.005 0.008 0.006 0.5 0.006 0.011 0.006 0.005 0.010 0.006 0.1 0.003 0.003 0.003 0.003 0.003 0.003 0.3 0.003 0.004 0.003 0.003 0.004 0.003 0.5 0.003 0.005 0.003 0.003 0.005 0.003 0.1 0.001 0.001 0.001 0.001 0.001 0.001 0.3 0.001 0.001 0.001 0.001 0.001 0.001 0.5 0.001 0.001 0.001 0.001 0.001 0.001 Table B.57: Empirical variance and asymptotic variance of mean of x (µˆx ) when ρ = 0.2 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500 n 50 100 500 p Empirical Variance Asymptotic Variance CLS Naive PM CLS Naive PM 0.1 1.955 2.124 1.965 1.984 2.078 1.971 0.3 1.955 2.211 2.031 1.984 2.128 2.001 0.5 1.955 2.488 2.140 1.984 2.278 1.996 0.1 0.958 1.000 0.967 0.997 0.988 0.986 0.3 0.958 1.078 0.987 0.997 1.055 0.992 0.5 0.958 1.237 1.050 0.997 1.124 0.991 0.1 0.206 0.213 0.204 0.200 0.197 0.199 0.3 0.206 0.238 0.210 0.200 0.212 0.202 0.5 0.206 0.282 0.222 0.200 0.226 0.201 127 Table B.58: Bias, RMSE, 95% CP, and relative RMSE of mean of x (µˆx ) when ρ = 0.2 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5 n p Bias CLS 50 100 500 RMSE CP RRMSE Naive PM CLS Naive PM CLS Naive PM CLS/PM Naive/PM 0.1 0.099 -0.980 0.131 1.412 1.743 1.410 0.95 0.90 0.94 1.00 1.24 0.3 0.099 -3.029 0.130 1.412 3.362 1.420 0.95 0.46 0.93 0.99 2.37 0.5 0.099 -4.562 0.092 1.412 4.805 1.416 0.95 0.16 0.92 1.00 3.39 0.1 0.064 -1.149 0.061 1.001 1.519 0.995 0.96 0.78 0.96 1.01 1.53 0.3 0.064 -2.927 0.022 1.001 3.102 0.996 0.96 0.19 0.94 1.00 3.11 0.5 0.064 -4.485 -0.020 1.001 4.609 0.996 0.96 0.01 0.93 1.00 4.63 0.1 0.027 -1.209 0.025 0.448 1.288 0.447 0.94 0.22 0.94 1.00 2.88 0.3 0.027 -3.022 0.001 0.448 3.057 0.449 0.94 0.00 0.94 1.00 6.81 0.5 0.027 -4.605 -0.043 0.448 4.629 0.450 0.94 0.00 0.93 0.99 10.28 Table B.59: Empirical variance and asymptotic variance of mean of y (µˆy ) when ρ = 0.2 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500 n 50 100 500 p 
Empirical Variance Asymptotic Variance CLS Naive PM CLS Naive PM 0.1 0.444 0.465 0.444 0.505 0.522 0.504 0.3 0.444 0.542 0.450 0.505 0.551 0.513 0.5 0.444 0.629 0.488 0.505 0.595 0.513 0.1 0.216 0.228 0.218 0.251 0.272 0.249 0.3 0.216 0.256 0.227 0.251 0.270 0.252 0.5 0.216 0.291 0.238 0.251 0.290 0.251 0.1 0.051 0.051 0.051 0.050 0.054 0.050 0.3 0.051 0.059 0.052 0.050 0.053 0.050 0.5 0.051 0.068 0.056 0.050 0.057 0.050 128 Table B.60: Bias, RMSE, 95% CP, and relative RMSE of mean of y (µˆy ) when ρ = 0.2 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5 n p Bias CLS 50 100 500 RMSE CP RRMSE Naive PM CLS Naive PM CLS Naive PM CLS/PM Naive/PM 0.1 0.005 -0.834 0.028 0.711 1.103 0.711 0.95 0.81 0.96 1.00 1.55 0.3 0.005 -1.568 0.026 0.711 1.735 0.716 0.95 0.44 0.96 0.99 2.42 0.5 0.005 -2.362 0.013 0.711 2.485 0.716 0.95 0.11 0.94 0.99 3.47 0.1 0.015 -0.596 0.016 0.501 0.792 0.499 0.97 0.80 0.96 1.00 1.59 0.3 0.015 -1.491 0.007 0.501 1.579 0.502 0.97 0.16 0.96 1.00 3.15 0.5 0.015 -2.276 -0.016 0.501 2.339 0.502 0.97 0.01 0.95 1.00 4.66 0.1 0.001 -0.619 -0.000 0.223 0.661 0.223 0.93 0.23 0.94 1.00 2.97 0.3 0.001 -1.523 -0.011 0.223 1.540 0.225 0.93 0.00 0.94 0.99 6.85 0.5 0.001 -2.307 -0.034 0.223 2.319 0.227 0.93 0.00 0.93 0.99 10.23 Table B.61: Empirical variance and asymptotic variance of standard deviation of x (σˆx ) when ρ = 0.2 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500 n 50 100 500 p Empirical Variance Asymptotic Variance CLS Naive PM CLS Naive PM 0.1 1.104 1.072 1.144 1.023 1.076 0.986 0.3 1.104 1.014 1.304 1.023 1.111 1.000 0.5 1.104 1.162 1.506 1.023 1.204 0.998 0.1 0.551 0.523 0.587 0.506 0.502 0.493 0.3 0.551 0.545 0.661 0.506 0.539 0.496 0.5 0.551 0.600 0.759 0.506 0.577 0.495 0.1 0.102 0.100 0.110 0.100 0.099 0.100 0.3 0.102 0.108 0.133 0.100 0.107 0.101 0.5 0.102 0.123 0.152 0.100 0.113 0.101 129 Table B.62: Bias, RMSE, 95% CP, and relative RMSE of standard 
deviation of x (σ̂x) when ρ = 0.2 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5

                  Bias                   RMSE                  CP               RRMSE
  n   p     CLS   Naive      PM     CLS  Naive     PM    CLS Naive    PM   CLS/PM  Naive/PM
 50 0.1  -0.096  -0.906  -0.129   1.016  1.377  1.001   0.94  0.80  0.93     1.02      1.38
    0.3  -0.096  -2.086  -0.163   1.016  2.337  1.013   0.94  0.47  0.90     1.00      2.31
    0.5  -0.096  -2.745  -0.234   1.016  2.956  1.026   0.94  0.32  0.86     0.99      2.88
100 0.1  -0.043  -1.014  -0.101   0.713  1.237  0.709   0.94  0.66  0.91     1.00      1.74
    0.3  -0.043  -1.998  -0.170   0.713  2.129  0.725   0.94  0.26  0.89     0.98      2.94
    0.5  -0.043  -2.650  -0.231   0.713  2.757  0.741   0.94  0.10  0.86     0.96      3.72
500 0.1  -0.008  -0.995  -0.023   0.317  1.044  0.316   0.95  0.14  0.94     1.00      3.30
    0.3  -0.008  -1.948  -0.066   0.317  1.976  0.324   0.95  0.00  0.91     0.98      6.09
    0.5  -0.008  -2.599  -0.129   0.317  2.621  0.342   0.95  0.00  0.86     0.93      7.66

Table B.63: Empirical variance and asymptotic variance of standard deviation of y (σ̂y) when ρ = 0.2 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

         Empirical Variance    Asymptotic Variance
  n   p    CLS  Naive     PM     CLS  Naive     PM
 50 0.1  0.257  0.235  0.280   0.261  0.270  0.252
    0.3  0.257  0.254  0.318   0.261  0.288  0.256
    0.5  0.257  0.291  0.373   0.261  0.315  0.256
100 0.1  0.126  0.113  0.132   0.127  0.138  0.124
    0.3  0.126  0.117  0.147   0.127  0.138  0.126
    0.5  0.126  0.143  0.163   0.127  0.149  0.126
500 0.1  0.025  0.022  0.026   0.025  0.027  0.025
    0.3  0.025  0.023  0.028   0.025  0.027  0.025
    0.5  0.025  0.027  0.034   0.025  0.028  0.025

Table B.64: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of y (σ̂y) when ρ = 0.2 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5

                  Bias                   RMSE                  CP               RRMSE
  n   p     CLS   Naive      PM     CLS  Naive     PM    CLS Naive    PM   CLS/PM  Naive/PM
 50 0.1   0.000  -0.635  -0.006   0.511  0.821  0.502   0.94  0.73  0.93     1.02      1.63
    0.3   0.000  -0.973  -0.020   0.511  1.111  0.507   0.94  0.52  0.90     1.01      2.19
    0.5   0.000  -1.291  -0.048   0.511  1.408  0.509   0.94  0.39  0.88     1.00      2.77
100 0.1  -0.005  -0.481  -0.026   0.357  0.608  0.354   0.94  0.73  0.93     1.01      1.72
    0.3  -0.005  -0.951  -0.044   0.357  1.021  0.358   0.94  0.29  0.91     1.00      2.86
    0.5  -0.005  -1.270  -0.077   0.357  1.327  0.363   0.94  0.14  0.89     0.98      3.66
500 0.1  -0.008  -0.502  -0.015   0.158  0.528  0.159   0.95  0.14  0.96     1.00      3.33
    0.3  -0.008  -0.974  -0.034   0.158  0.987  0.162   0.95  0.00  0.92     0.98      6.09
    0.5  -0.008  -1.295  -0.067   0.158  1.306  0.172   0.95  0.00  0.88     0.92      7.59

Table B.65: Empirical variance and asymptotic variance of correlation (ρ̂) when ρ = 0.2 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

         Empirical Variance    Asymptotic Variance
  n   p    CLS  Naive     PM     CLS  Naive     PM
 50 0.1  0.020  0.022  0.020   0.019  0.022  0.018
    0.3  0.020  0.030  0.020   0.019  0.027  0.018
    0.5  0.020  0.039  0.020   0.019  0.034  0.018
100 0.1  0.010  0.011  0.010   0.009  0.011  0.009
    0.3  0.010  0.013  0.010   0.009  0.013  0.009
    0.5  0.010  0.018  0.010   0.009  0.017  0.009
500 0.1  0.002  0.002  0.002   0.002  0.002  0.002
    0.3  0.002  0.003  0.002   0.002  0.003  0.002
    0.5  0.002  0.004  0.002   0.002  0.003  0.002

Table B.66: Bias, RMSE, 95% CP, and relative RMSE of correlation (ρ̂) when ρ = 0.2 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5

                  Bias                   RMSE                  CP               RRMSE
  n   p     CLS   Naive      PM     CLS  Naive     PM    CLS Naive    PM   CLS/PM  Naive/PM
 50 0.1  -0.005  -0.040  -0.006   0.137  0.152  0.134   0.93  0.93  0.91     1.02      1.14
    0.3  -0.005  -0.069  -0.011   0.137  0.178  0.134   0.93  0.92  0.92     1.02      1.32
    0.5  -0.005  -0.086  -0.018   0.137  0.204  0.135   0.93  0.90  0.92     1.01      1.51
100 0.1  -0.004  -0.038  -0.005   0.097  0.109  0.096   0.93  0.92  0.93     1.01      1.14
    0.3  -0.004  -0.061  -0.009   0.097  0.130  0.096   0.93  0.92  0.93     1.01      1.35
    0.5  -0.004  -0.074  -0.015   0.097  0.149  0.097   0.93  0.90  0.93     0.99      1.53
500 0.1   0.000  -0.034  -0.001   0.042  0.057  0.042   0.95  0.89  0.95     1.00      1.35
    0.3   0.000  -0.064  -0.005   0.042  0.082  0.043   0.95  0.79  0.96     0.99      1.91
    0.5   0.000  -0.083  -0.011   0.042  0.101  0.045   0.95  0.71  0.95     0.95      2.25

Table B.67: Empirical variance and asymptotic variance of intercept (β̂0) when ρ = 0.4 over different right censoring proportions of 0.1, 0.3,
and 0.5 and sample sizes 50, 100, and 500

         Empirical Variance    Asymptotic Variance
  n   p    CLS  Naive     PM     CLS  Naive     PM
 50 0.1  0.400  0.401  0.400   0.476  0.508  0.486
    0.3  0.400  0.460  0.401   0.476  0.509  0.493
    0.5  0.400  0.593  0.437   0.476  0.594  0.495
100 0.1  0.189  0.195  0.191   0.234  0.249  0.234
    0.3  0.189  0.222  0.194   0.234  0.245  0.236
    0.5  0.189  0.277  0.204   0.234  0.280  0.236
500 0.1  0.045  0.044  0.045   0.046  0.049  0.046
    0.3  0.045  0.050  0.046   0.046  0.048  0.046
    0.5  0.045  0.058  0.049   0.046  0.055  0.046

Table B.68: Bias, RMSE, 95% CP, and relative RMSE of intercept (β̂0) when ρ = 0.4 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5

                  Bias                   RMSE                  CP               RRMSE
  n   p     CLS   Naive      PM     CLS  Naive     PM    CLS Naive    PM   CLS/PM  Naive/PM
 50 0.1  -0.003  -0.418   0.014   0.690  0.827  0.697   0.96  0.93  0.97     0.99      1.19
    0.3  -0.003  -1.037   0.016   0.690  1.259  0.703   0.96  0.70  0.96     0.98      1.79
    0.5  -0.003  -1.640   0.020   0.690  1.812  0.704   0.96  0.42  0.95     0.98      2.57
100 0.1   0.010  -0.346   0.013   0.483  0.608  0.484   0.97  0.91  0.97     1.00      1.26
    0.3   0.010  -0.971   0.012   0.483  1.090  0.486   0.97  0.50  0.96     0.99      2.24
    0.5   0.010  -1.565   0.006   0.483  1.652  0.486   0.97  0.15  0.97     1.00      3.40
500 0.1  -0.003  -0.363  -0.002   0.214  0.425  0.214   0.95  0.65  0.95     1.00      1.98
    0.3  -0.003  -0.995  -0.003   0.214  1.018  0.215   0.95  0.00  0.94     1.00      4.73
    0.5  -0.003  -1.602  -0.014   0.214  1.619  0.215   0.95  0.00  0.93     1.00      7.52

Table B.69: Empirical variance and asymptotic variance of slope (β̂1) when ρ = 0.4 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

         Empirical Variance    Asymptotic Variance
  n   p    CLS  Naive     PM     CLS  Naive     PM
 50 0.1  0.005  0.006  0.005   0.005  0.006  0.005
    0.3  0.005  0.008  0.005   0.005  0.007  0.005
    0.5  0.005  0.011  0.006   0.005  0.009  0.005
100 0.1  0.002  0.003  0.002   0.002  0.003  0.002
    0.3  0.002  0.004  0.002   0.002  0.003  0.002
    0.5  0.002  0.005  0.003   0.002  0.004  0.002
500 0.1  0.000  0.000  0.000   0.000  0.001  0.000
    0.3  0.000  0.001  0.000   0.000  0.001  0.000
    0.5  0.000  0.001  0.000   0.000  0.001  0.001

Table B.70: Bias, RMSE,
95% CP, and relative RMSE of slope (β̂1) when ρ = 0.4 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5

                  Bias                   RMSE                  CP               RRMSE
  n   p     CLS   Naive      PM     CLS  Naive     PM    CLS Naive    PM   CLS/PM  Naive/PM
 50 0.1   0.001  -0.030   0.000   0.068  0.080  0.069   0.93  0.90  0.92     0.99      1.17
    0.3   0.001  -0.053  -0.005   0.068  0.099  0.070   0.93  0.87  0.92     0.98      1.43
    0.5   0.001  -0.069  -0.010   0.068  0.118  0.072   0.93  0.84  0.92     0.94      1.64
100 0.1  -0.001  -0.028  -0.002   0.047  0.058  0.047   0.94  0.90  0.93     1.00      1.23
    0.3  -0.001  -0.051  -0.005   0.047  0.076  0.048   0.94  0.83  0.94     0.97      1.58
    0.5  -0.001  -0.065  -0.010   0.047  0.092  0.050   0.94  0.81  0.92     0.94      1.83
500 0.1   0.000  -0.028  -0.001   0.020  0.036  0.020   0.96  0.78  0.96     1.00      1.79
    0.3   0.000  -0.054  -0.005   0.020  0.059  0.021   0.96  0.43  0.95     0.97      2.86
    0.5   0.000  -0.071  -0.010   0.020  0.076  0.025   0.96  0.27  0.92     0.82      3.10

Table B.71: Empirical variance and asymptotic variance of mean of x (µ̂x) when ρ = 0.4 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

         Empirical Variance    Asymptotic Variance
  n   p    CLS  Naive     PM     CLS  Naive     PM
 50 0.1  1.955  2.084  1.982   1.984  2.110  1.959
    0.3  1.955  2.209  2.044   1.984  2.081  1.976
    0.5  1.955  2.414  2.125   1.984  2.194  1.947
100 0.1  0.948  0.985  0.956   0.997  1.064  0.982
    0.3  0.948  1.076  0.983   0.997  1.036  0.980
    0.5  0.948  1.167  1.025   0.997  1.088  0.967
500 0.1  0.207  0.211  0.205   0.200  0.213  0.199
    0.3  0.207  0.231  0.210   0.200  0.207  0.199
    0.5  0.207  0.271  0.220   0.200  0.218  0.196

Table B.72: Bias, RMSE, 95% CP, and relative RMSE of mean of x (µ̂x) when ρ = 0.4 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5

                  Bias                   RMSE                  CP               RRMSE
  n   p     CLS   Naive      PM     CLS  Naive     PM    CLS Naive    PM   CLS/PM  Naive/PM
 50 0.1   0.099  -1.420   0.119   1.412  2.032  1.405   0.94  0.84  0.94     1.01      1.45
    0.3   0.099  -3.256   0.097   1.412  3.562  1.409   0.94  0.39  0.94     1.00      2.53
    0.5   0.099  -4.876   0.018   1.412  5.096  1.395   0.94  0.10  0.93     1.01      3.65
100 0.1   0.065  -1.268   0.054   1.000  1.635  0.992   0.96  0.77  0.96     1.01      1.65
    0.3   0.065  -3.160  -0.008   1.000  3.320  0.990   0.96  0.12  0.95     1.01      3.35
    0.5   0.065  -4.777  -0.091   1.000  4.890  0.988   0.96  0.01  0.93     1.01      4.95
500 0.1   0.027  -1.326   0.017   0.448  1.404  0.446   0.94  0.17  0.94     1.00      3.15
    0.3   0.027  -3.246  -0.033   0.448  3.278  0.447   0.94  0.00  0.94     1.00      7.33
    0.5   0.027  -4.884  -0.117   0.448  4.907  0.458   0.94  0.00  0.92     0.98     10.71

Table B.73: Empirical variance and asymptotic variance of mean of y (µ̂y) when ρ = 0.4 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

         Empirical Variance    Asymptotic Variance
  n   p    CLS  Naive     PM     CLS  Naive     PM
 50 0.1  0.448  0.468  0.448   0.504  0.541  0.502
    0.3  0.448  0.532  0.458   0.504  0.536  0.504
    0.5  0.448  0.619  0.492   0.504  0.572  0.500
100 0.1  0.214  0.224  0.216   0.251  0.269  0.248
    0.3  0.214  0.245  0.225   0.251  0.264  0.248
    0.5  0.214  0.276  0.236   0.251  0.279  0.245
500 0.1  0.051  0.051  0.051   0.050  0.053  0.050
    0.3  0.051  0.059  0.053   0.050  0.052  0.050
    0.5  0.051  0.066  0.056   0.050  0.055  0.049

Table B.74: Bias, RMSE, 95% CP, and relative RMSE of mean of y (µ̂y) when ρ = 0.4 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5

                  Bias                   RMSE                  CP               RRMSE
  n   p     CLS   Naive      PM     CLS  Naive     PM    CLS Naive    PM   CLS/PM  Naive/PM
 50 0.1   0.012  -0.754   0.031   0.710  1.053  0.709   0.95  0.84  0.95     1.00      1.49
    0.3   0.012  -1.678   0.011   0.710  1.831  0.710   0.95  0.36  0.95     1.00      2.58
    0.5   0.012  -2.489  -0.017   0.710  2.602  0.707   0.95  0.08  0.93     1.00      3.68
100 0.1   0.019  -0.651   0.017   0.501  0.832  0.498   0.97  0.76  0.96     1.01      1.67
    0.3   0.019  -1.598  -0.009   0.501  1.678  0.498   0.97  0.11  0.95     1.01      3.37
    0.5   0.019  -2.413  -0.047   0.501  2.471  0.498   0.97  0.01  0.95     1.01      4.97
500 0.1   0.003  -0.675  -0.002   0.223  0.714  0.223   0.93  0.15  0.93     1.00      3.20
    0.3   0.003  -1.632  -0.025   0.223  1.647  0.225   0.93  0.00  0.93     1.00      7.34
    0.5   0.003  -2.447  -0.067   0.223  2.458  0.232   0.93  0.00  0.91     0.97     10.62

Table B.75: Empirical variance and asymptotic variance of standard deviation of x (σ̂x) when ρ = 0.4 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample
sizes 50, 100, and 500

         Empirical Variance    Asymptotic Variance
  n   p    CLS  Naive     PM     CLS  Naive     PM
 50 0.1  1.096  0.984  1.154   1.025  1.094  0.980
    0.3  1.096  0.997  1.271   1.025  1.087  0.988
    0.5  1.096  1.092  1.425   1.025  1.159  0.974
100 0.1  0.546  0.506  0.575   0.507  0.541  0.491
    0.3  0.546  0.520  0.637   0.507  0.529  0.490
    0.5  0.546  0.561  0.714   0.507  0.559  0.484
500 0.1  0.103  0.100  0.110   0.100  0.107  0.099
    0.3  0.103  0.105  0.131   0.100  0.104  0.100
    0.5  0.103  0.117  0.147   0.100  0.110  0.098

Table B.76: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of x (σ̂x) when ρ = 0.4 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5

                  Bias                   RMSE                  CP               RRMSE
  n   p     CLS   Naive      PM     CLS  Naive     PM    CLS Naive    PM   CLS/PM  Naive/PM
 50 0.1  -0.096  -1.198  -0.160   1.017  1.590  1.003   0.94  0.73  0.91     1.01      1.59
    0.3  -0.096  -2.090  -0.222   1.017  2.336  1.019   0.94  0.46  0.88     1.00      2.29
    0.5  -0.096  -2.725  -0.352   1.017  2.929  1.048   0.94  0.33  0.84     0.97      2.80
100 0.1  -0.044  -1.051  -0.120   0.713  1.283  0.711   0.93  0.67  0.91     1.00      1.80
    0.3  -0.044  -1.993  -0.230   0.713  2.121  0.737   0.93  0.26  0.88     0.97      2.88
    0.5  -0.044  -2.626  -0.347   0.713  2.730  0.777   0.93  0.09  0.84     0.92      3.51
500 0.1  -0.008  -1.029  -0.041   0.317  1.080  0.318   0.95  0.13  0.93     1.00      3.40
    0.3  -0.008  -1.963  -0.128   0.317  1.989  0.340   0.95  0.00  0.88     0.93      5.84
    0.5  -0.008  -2.583  -0.247   0.317  2.604  0.399   0.95  0.00  0.80     0.79      6.53

Table B.77: Empirical variance and asymptotic variance of standard deviation of y (σ̂y) when ρ = 0.4 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

         Empirical Variance    Asymptotic Variance
  n   p    CLS  Naive     PM     CLS  Naive     PM
 50 0.1  0.264  0.246  0.275   0.261  0.280  0.251
    0.3  0.264  0.252  0.319   0.261  0.280  0.252
    0.5  0.264  0.278  0.349   0.261  0.302  0.250
100 0.1  0.127  0.110  0.129   0.127  0.137  0.124
    0.3  0.127  0.115  0.145   0.127  0.135  0.124
    0.5  0.127  0.132  0.157   0.127  0.143  0.123
500 0.1  0.024  0.022  0.026   0.025  0.027  0.025
    0.3  0.024  0.023  0.029   0.025  0.026  0.025
    0.5  0.024  0.026  0.033   0.025  0.027  0.025
Table B.78: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of y (σ̂y) when ρ = 0.4 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5

                  Bias                   RMSE                  CP               RRMSE
  n   p     CLS   Naive      PM     CLS  Naive     PM    CLS Naive    PM   CLS/PM  Naive/PM
 50 0.1  -0.004  -0.544  -0.018   0.511  0.759  0.501   0.93  0.77  0.92     1.02      1.51
    0.3  -0.004  -0.984  -0.063   0.511  1.118  0.506   0.93  0.52  0.89     1.01      2.21
    0.5  -0.004  -1.285  -0.111   0.511  1.398  0.512   0.93  0.37  0.87     1.00      2.73
100 0.1  -0.008  -0.501  -0.038   0.357  0.622  0.354   0.94  0.72  0.93     1.01      1.76
    0.3  -0.008  -0.959  -0.083   0.357  1.027  0.362   0.94  0.28  0.91     0.99      2.84
    0.5  -0.008  -1.263  -0.136   0.357  1.319  0.376   0.94  0.13  0.87     0.95      3.51
500 0.1  -0.007  -0.517  -0.024   0.158  0.543  0.159   0.96  0.11  0.96     0.99      3.41
    0.3  -0.007  -0.979  -0.063   0.158  0.993  0.170   0.96  0.00  0.89     0.93      5.84
    0.5  -0.007  -1.286  -0.123   0.158  1.297  0.199   0.96  0.00  0.84     0.80      6.51

Table B.79: Empirical variance and asymptotic variance of correlation (ρ̂) when ρ = 0.4 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

         Empirical Variance    Asymptotic Variance
  n   p    CLS  Naive     PM     CLS  Naive     PM
 50 0.1  0.015  0.020  0.015   0.015  0.018  0.014
    0.3  0.015  0.026  0.016   0.015  0.023  0.014
    0.5  0.015  0.035  0.016   0.015  0.030  0.015
100 0.1  0.007  0.009  0.007   0.007  0.009  0.007
    0.3  0.007  0.012  0.008   0.007  0.011  0.007
    0.5  0.007  0.016  0.008   0.007  0.014  0.007
500 0.1  0.002  0.002  0.002   0.001  0.002  0.001
    0.3  0.002  0.002  0.002   0.001  0.002  0.001
    0.5  0.002  0.003  0.002   0.001  0.003  0.002

Table B.80: Bias, RMSE, 95% CP, and relative RMSE of correlation (ρ̂) when ρ = 0.4 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5

                  Bias                   RMSE                  CP               RRMSE
  n   p     CLS   Naive      PM     CLS  Naive     PM    CLS Naive    PM   CLS/PM  Naive/PM
 50 0.1  -0.007  -0.070  -0.010   0.121  0.152  0.119   0.93  0.91  0.92     1.02      1.28
    0.3  -0.007  -0.117  -0.019   0.121  0.192  0.121   0.93  0.90  0.93     1.00      1.59
    0.5  -0.007  -0.151  -0.030   0.121  0.230  0.124   0.93  0.86  0.91     0.98      1.85
100 0.1  -0.005  -0.061  -0.007   0.085  0.111  0.085   0.93  0.91  0.93     1.00      1.32
    0.3  -0.005  -0.108  -0.015   0.085  0.152  0.086   0.93  0.85  0.94     0.99      1.76
    0.5  -0.005  -0.136  -0.026   0.085  0.182  0.089   0.93  0.80  0.92     0.95      2.04
500 0.1   0.000  -0.056  -0.002   0.037  0.069  0.038   0.95  0.75  0.95     1.00      1.85
    0.3   0.000  -0.108  -0.010   0.037  0.118  0.039   0.95  0.38  0.95     0.96      3.03
    0.5   0.000  -0.142  -0.020   0.037  0.152  0.044   0.95  0.23  0.92     0.85      3.46

Table B.81: Empirical variance and asymptotic variance of intercept (β̂0) when ρ = 0.6 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

         Empirical Variance    Asymptotic Variance
  n   p    CLS  Naive     PM     CLS  Naive     PM
 50 0.1  0.304  0.315  0.303   0.363  0.351  0.370
    0.3  0.304  0.350  0.305   0.363  0.359  0.377
    0.5  0.304  0.449  0.327   0.363  0.418  0.380
100 0.1  0.145  0.151  0.146   0.178  0.172  0.178
    0.3  0.145  0.167  0.152   0.178  0.173  0.180
    0.5  0.145  0.209  0.154   0.178  0.198  0.181
500 0.1  0.034  0.034  0.034   0.035  0.033  0.035
    0.3  0.034  0.038  0.035   0.035  0.034  0.035
    0.5  0.034  0.044  0.035   0.035  0.039  0.035

Table B.82: Bias, RMSE, 95% CP, and relative RMSE of intercept (β̂0) when ρ = 0.6 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5

                  Bias                   RMSE                  CP               RRMSE
  n   p     CLS   Naive      PM     CLS  Naive     PM    CLS Naive    PM   CLS/PM  Naive/PM
 50 0.1  -0.008  -0.280   0.005   0.602  0.655  0.608   0.97  0.93  0.96     0.99      1.08
    0.3  -0.008  -0.741   0.012   0.602  0.953  0.614   0.97  0.77  0.96     0.98      1.55
    0.5  -0.008  -1.199   0.012   0.602  1.362  0.617   0.97  0.54  0.96     0.98      2.21
100 0.1   0.006  -0.226   0.009   0.422  0.472  0.422   0.97  0.92  0.97     1.00      1.12
    0.3   0.006  -0.680   0.011   0.422  0.797  0.425   0.97  0.61  0.96     0.99      1.88
    0.5   0.006  -1.148   0.011   0.422  1.231  0.425   0.97  0.27  0.96     0.99      2.90
500 0.1  -0.003  -0.238  -0.001   0.187  0.300  0.187   0.94  0.75  0.95     1.00      1.61
    0.3  -0.003  -0.701  -0.001   0.187  0.724  0.188   0.94  0.05  0.95     0.99      3.85
    0.5  -0.003  -1.175  -0.007   0.187  1.191  0.188   0.94  0.00  0.94     1.00      6.35

Table B.83: Empirical variance and asymptotic variance of slope (β̂1) when ρ = 0.6 over different right censoring
proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

         Empirical Variance    Asymptotic Variance
  n   p    CLS  Naive     PM     CLS  Naive     PM
 50 0.1  0.004  0.005  0.004   0.004  0.004  0.004
    0.3  0.004  0.007  0.004   0.004  0.006  0.004
    0.5  0.004  0.009  0.005   0.004  0.008  0.004
100 0.1  0.002  0.002  0.002   0.002  0.002  0.002
    0.3  0.002  0.003  0.002   0.002  0.003  0.002
    0.5  0.002  0.004  0.002   0.002  0.003  0.002
500 0.1  0.000  0.000  0.000   0.000  0.000  0.000
    0.3  0.000  0.001  0.000   0.000  0.001  0.000
    0.5  0.000  0.001  0.000   0.000  0.001  0.000

Table B.84: Bias, RMSE, 95% CP, and relative RMSE of slope (β̂1) when ρ = 0.6 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5

                  Bias                   RMSE                  CP               RRMSE
  n   p     CLS   Naive      PM     CLS  Naive     PM    CLS Naive    PM   CLS/PM  Naive/PM
 50 0.1   0.001  -0.031   0.000   0.059  0.073  0.060   0.94  0.89  0.92     0.99      1.22
    0.3   0.001  -0.059  -0.006   0.059  0.096  0.061   0.94  0.85  0.92     0.97      1.57
    0.5   0.001  -0.078  -0.013   0.059  0.117  0.065   0.94  0.80  0.90     0.92      1.81
100 0.1  -0.000  -0.030  -0.002   0.041  0.054  0.041   0.93  0.87  0.93     1.00      1.31
    0.3  -0.000  -0.058  -0.006   0.041  0.078  0.043   0.93  0.78  0.92     0.96      1.82
    0.5  -0.000  -0.077  -0.013   0.041  0.096  0.046   0.93  0.71  0.91     0.91      2.12
500 0.1   0.000  -0.030  -0.002   0.017  0.036  0.017   0.96  0.67  0.95     0.99      2.07
    0.3   0.000  -0.059  -0.007   0.017  0.063  0.019   0.96  0.27  0.92     0.94      3.43
    0.5   0.000  -0.081  -0.013   0.017  0.084  0.024   0.96  0.14  0.88     0.72      3.50

Table B.85: Empirical variance and asymptotic variance of mean of x (µ̂x) when ρ = 0.6 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

         Empirical Variance    Asymptotic Variance
  n   p    CLS  Naive     PM     CLS  Naive     PM
 50 0.1  1.954  2.087  1.979   1.984  1.802  1.947
    0.3  1.954  2.222  2.053   1.984  1.744  1.943
    0.5  1.954  2.345  2.083   1.984  1.772  1.900
100 0.1  0.937  0.985  0.945   0.996  0.911  0.976
    0.3  0.937  1.055  0.974   0.996  0.869  0.966
    0.5  0.937  1.146  0.999   0.996  0.880  0.946
500 0.1  0.208  0.214  0.207   0.200  0.182  0.198
    0.3  0.208  0.233  0.213   0.200  0.174  0.196
    0.5  0.208  0.261  0.216   0.200  0.176  0.192
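
The RMSE and RRMSE columns in these tables can be cross-checked against the bias and variance columns. The short Python sketch below assumes that RMSE is combined as sqrt(bias² + variance) and that RRMSE is a method's RMSE divided by the RMSE of PM. The tabulated values appear consistent with this when the asymptotic variance column is used, but that is an inference from the numbers rather than a statement of how the thesis computed them. The function names are illustrative only, and the inputs are the n = 50, p = 0.1 entries of Tables B.85 and B.86.

```python
import math


def rmse_from_bias_var(bias, variance):
    """RMSE implied by a bias and a variance: sqrt(bias^2 + variance)."""
    return math.sqrt(bias ** 2 + variance)


def relative_rmse(rmse_method, rmse_pm):
    """RMSE of a method divided by the RMSE of PM (values above 1 favour PM)."""
    return rmse_method / rmse_pm


# Entries for n = 50, p = 0.1, rho = 0.6, copied from Tables B.85 and B.86.
# Assumption: the asymptotic variance column is the one that enters the RMSE.
asymptotic_var = {"CLS": 1.984, "Naive": 1.802, "PM": 1.947}
bias = {"CLS": 0.097, "Naive": -1.517, "PM": 0.103}

rmse = {m: rmse_from_bias_var(bias[m], asymptotic_var[m]) for m in bias}
rrmse = {m: relative_rmse(rmse[m], rmse["PM"]) for m in ("CLS", "Naive")}

for method, value in rmse.items():
    print(method, "RMSE:", round(value, 3))
for method, value in rrmse.items():
    print(method + "/PM RRMSE:", round(value, 2))
```

Under these assumptions the sketch gives RMSE values of 1.412, 2.026, and 1.399 for CLS, Naive, and PM, and RRMSE values of 1.01 and 1.45, which agree with the first row of Table B.86 to the reported precision.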
Table B.86: Bias, RMSE, 95% CP, and relative RMSE of mean of x (µ̂x) when ρ = 0.6 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5

                  Bias                   RMSE                  CP               RRMSE
  n   p     CLS   Naive      PM     CLS  Naive     PM    CLS Naive    PM   CLS/PM  Naive/PM
 50 0.1   0.097  -1.517   0.103   1.412  2.026  1.399   0.94  0.79  0.93     1.01      1.45
    0.3   0.097  -3.400   0.049   1.412  3.648  1.395   0.94  0.29  0.93     1.01      2.62
    0.5   0.097  -5.032  -0.042   1.412  5.205  1.379   0.94  0.05  0.93     1.02      3.77
100 0.1   0.066  -1.342   0.043   1.000  1.647  0.989   0.95  0.70  0.96     1.01      1.67
    0.3   0.066  -3.296  -0.044   1.000  3.425  0.984   0.95  0.08  0.95     1.02      3.48
    0.5   0.066  -4.951  -0.158   1.000  5.039  0.985   0.95  0.00  0.93     1.02      5.11
500 0.1   0.026  -1.398   0.005   0.448  1.462  0.444   0.94  0.10  0.94     1.01      3.29
    0.3   0.026  -3.363  -0.071   0.448  3.389  0.449   0.94  0.00  0.93     1.00      7.55
    0.5   0.026  -5.031  -0.188   0.448  5.049  0.477   0.94  0.00  0.90     0.94     10.59

Table B.87: Empirical variance and asymptotic variance of mean of y (µ̂y) when ρ = 0.6 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

         Empirical Variance    Asymptotic Variance
  n   p    CLS  Naive     PM     CLS  Naive     PM
 50 0.1  0.454  0.482  0.456   0.503  0.462  0.497
    0.3  0.454  0.533  0.471   0.503  0.447  0.495
    0.5  0.454  0.608  0.491   0.503  0.460  0.486
100 0.1  0.212  0.221  0.213   0.250  0.230  0.246
    0.3  0.212  0.236  0.227   0.250  0.221  0.244
    0.5  0.212  0.262  0.231   0.250  0.225  0.239
500 0.1  0.052  0.051  0.052   0.050  0.046  0.049
    0.3  0.052  0.058  0.054   0.050  0.044  0.049
    0.5  0.052  0.064  0.055   0.050  0.044  0.048

Table B.88: Bias, RMSE, 95% CP, and relative RMSE of mean of y (µ̂y) when ρ = 0.6 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5

                  Bias                   RMSE                  CP               RRMSE
  n   p     CLS   Naive      PM     CLS  Naive     PM    CLS Naive    PM   CLS/PM  Naive/PM
 50 0.1   0.019  -0.784   0.030   0.710  1.038  0.706   0.96  0.80  0.96     1.01      1.47
    0.3   0.019  -1.740  -0.000   0.710  1.864  0.704   0.96  0.28  0.95     1.01      2.65
    0.5   0.019  -2.553  -0.047   0.710  2.641  0.699   0.96  0.05  0.94     1.02      3.78
100 0.1   0.023  -0.680   0.015   0.501  0.832  0.496   0.97  0.71  0.96     1.01      1.68
    0.3   0.023  -1.654  -0.023   0.501  1.720  0.495   0.97  0.07  0.95     1.01      3.47
    0.5   0.023  -2.488  -0.079   0.501  2.533  0.495   0.97  0.00  0.95     1.01      5.11
500 0.1   0.005  -0.706  -0.005   0.223  0.737  0.222   0.93  0.09  0.93     1.00      3.32
    0.3   0.005  -1.689  -0.041   0.223  1.702  0.225   0.93  0.00  0.93     0.99      7.55
    0.5   0.005  -2.521  -0.101   0.223  2.530  0.241   0.93  0.00  0.90     0.93     10.48

Table B.89: Empirical variance and asymptotic variance of standard deviation of x (σ̂x) when ρ = 0.6 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

         Empirical Variance    Asymptotic Variance
  n   p    CLS  Naive     PM     CLS  Naive     PM
 50 0.1  1.088  0.943  1.125   1.030  0.938  0.973
    0.3  1.088  0.944  1.245   1.030  0.914  0.972
    0.5  1.088  1.006  1.348   1.030  0.937  0.950
100 0.1  0.540  0.495  0.564   0.507  0.465  0.488
    0.3  0.540  0.491  0.614   0.507  0.444  0.483
    0.5  0.540  0.522  0.679   0.507  0.452  0.473
500 0.1  0.104  0.099  0.112   0.100  0.092  0.099
    0.3  0.104  0.101  0.130   0.100  0.088  0.098
    0.5  0.104  0.110  0.145   0.100  0.088  0.096

Table B.90: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of x (σ̂x) when ρ = 0.6 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5

                  Bias                   RMSE                  CP               RRMSE
  n   p     CLS   Naive      PM     CLS  Naive     PM    CLS Naive    PM   CLS/PM  Naive/PM
 50 0.1  -0.095  -1.270  -0.191   1.019  1.597  1.005   0.94  0.68  0.91     1.01      1.59
    0.3  -0.095  -2.163  -0.304   1.019  2.364  1.032   0.94  0.40  0.87     0.99      2.29
    0.5  -0.095  -2.773  -0.467   1.019  2.937  1.081   0.94  0.24  0.83     0.94      2.72
100 0.1  -0.045  -1.114  -0.149   0.714  1.306  0.714   0.93  0.61  0.91     1.00      1.83
    0.3  -0.045  -2.067  -0.298   0.714  2.172  0.756   0.93  0.18  0.87     0.94      2.87
    0.5  -0.045  -2.682  -0.454   0.714  2.765  0.824   0.93  0.06  0.82     0.87      3.36
500 0.1  -0.008  -1.087  -0.067   0.317  1.129  0.321   0.94  0.07  0.93     0.99      3.51
    0.3  -0.008  -2.030  -0.198   0.317  2.051  0.370   0.94  0.00  0.85     0.86      5.54
    0.5  -0.008  -2.645  -0.357   0.317  2.662  0.473   0.94  0.00  0.73     0.67      5.63

Table B.91: Empirical variance and asymptotic variance of standard deviation of y (σ̂y) when ρ = 0.6 over
different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

         Empirical Variance    Asymptotic Variance
  n   p    CLS  Naive     PM     CLS  Naive     PM
 50 0.1  0.271  0.244  0.277   0.261  0.241  0.249
    0.3  0.271  0.243  0.319   0.261  0.234  0.248
    0.5  0.271  0.263  0.342   0.261  0.243  0.243
100 0.1  0.127  0.108  0.127   0.127  0.117  0.123
    0.3  0.127  0.109  0.144   0.127  0.113  0.122
    0.5  0.127  0.122  0.155   0.127  0.116  0.120
500 0.1  0.024  0.022  0.025   0.025  0.023  0.025
    0.3  0.024  0.023  0.029   0.025  0.022  0.025
    0.5  0.024  0.023  0.033   0.025  0.022  0.024

Table B.92: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of y (σ̂y) when ρ = 0.6 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5

                  Bias                   RMSE                  CP               RRMSE
  n   p     CLS   Naive      PM     CLS  Naive     PM    CLS Naive    PM   CLS/PM  Naive/PM
 50 0.1  -0.011  -0.579  -0.043   0.511  0.759  0.500   0.93  0.73  0.91     1.02      1.52
    0.3  -0.011  -1.032  -0.106   0.511  1.140  0.509   0.93  0.43  0.88     1.00      2.24
    0.5  -0.011  -1.319  -0.180   0.511  1.408  0.525   0.93  0.29  0.86     0.97      2.69
100 0.1  -0.011  -0.532  -0.055   0.357  0.633  0.355   0.94  0.64  0.92     1.01      1.78
    0.3  -0.011  -1.000  -0.120   0.357  1.055  0.370   0.94  0.20  0.89     0.97      2.85
    0.5  -0.011  -1.298  -0.196   0.357  1.342  0.398   0.94  0.07  0.84     0.90      3.38
500 0.1  -0.007  -0.543  -0.034   0.158  0.564  0.161   0.96  0.07  0.94     0.98      3.51
    0.3  -0.007  -1.014  -0.098   0.158  1.025  0.184   0.96  0.00  0.85     0.86      5.56
    0.5  -0.007  -1.318  -0.179   0.158  1.326  0.236   0.96  0.00  0.74     0.67      5.61

Table B.93: Empirical variance and asymptotic variance of correlation (ρ̂) when ρ = 0.6 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

         Empirical Variance    Asymptotic Variance
  n   p    CLS  Naive     PM     CLS  Naive     PM
 50 0.1  0.009  0.013  0.009   0.009  0.012  0.009
    0.3  0.009  0.018  0.010   0.009  0.017  0.009
    0.5  0.009  0.026  0.010   0.009  0.023  0.009
100 0.1  0.004  0.006  0.004   0.004  0.006  0.004
    0.3  0.004  0.008  0.005   0.004  0.008  0.004
    0.5  0.004  0.012  0.005   0.004  0.011  0.005
500 0.1  0.001  0.001  0.001   0.001  0.001  0.001
    0.3  0.001  0.002  0.001   0.001  0.002  0.001
    0.5  0.001  0.002  0.001   0.001  0.002  0.001

Table B.94: Bias, RMSE, 95% CP, and relative RMSE of correlation (ρ̂) when ρ = 0.6 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5

                  Bias                   RMSE                  CP               RRMSE
  n   p     CLS   Naive      PM     CLS  Naive     PM    CLS Naive    PM   CLS/PM  Naive/PM
 50 0.1  -0.007  -0.075  -0.012   0.094  0.133  0.093   0.92  0.92  0.93     1.01      1.44
    0.3  -0.007  -0.132  -0.024   0.094  0.185  0.097   0.92  0.89  0.92     0.97      1.90
    0.5  -0.007  -0.173  -0.040   0.094  0.229  0.105   0.92  0.84  0.95     0.90      2.18
100 0.1  -0.004  -0.066  -0.008   0.066  0.100  0.065   0.94  0.89  0.94     1.01      1.54
    0.3  -0.004  -0.123  -0.019   0.066  0.152  0.069   0.94  0.77  0.93     0.95      2.21
    0.5  -0.004  -0.163  -0.033   0.066  0.193  0.076   0.94  0.69  0.93     0.87      2.55
500 0.1  -0.000  -0.060  -0.004   0.028  0.069  0.029   0.95  0.58  0.95     0.99      2.41
    0.3  -0.000  -0.119  -0.014   0.028  0.125  0.033   0.95  0.12  0.93     0.86      3.80
    0.5  -0.000  -0.162  -0.027   0.028  0.169  0.041   0.95  0.05  0.88     0.70      4.16

Table B.95: Empirical variance and asymptotic variance of intercept (β̂0) when ρ = 0.8 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

         Empirical Variance    Asymptotic Variance
  n   p    CLS  Naive     PM     CLS  Naive     PM
 50 0.1  0.170  0.172  0.170   0.204  0.206  0.208
    0.3  0.170  0.197  0.174   0.204  0.219  0.218
    0.5  0.170  0.260  0.188   0.204  0.258  0.231
100 0.1  0.083  0.086  0.083   0.100  0.101  0.101
    0.3  0.083  0.095  0.086   0.100  0.105  0.105
    0.5  0.083  0.116  0.091   0.100  0.122  0.109
500 0.1  0.019  0.019  0.019   0.020  0.020  0.020
    0.3  0.019  0.021  0.020   0.020  0.020  0.020
    0.5  0.019  0.024  0.020   0.020  0.024  0.021

Table B.96: Bias, RMSE, 95% CP, and relative RMSE of intercept (β̂0) when ρ = 0.8 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5

                  Bias                   RMSE                  CP               RRMSE
  n   p     CLS   Naive      PM     CLS  Naive     PM    CLS Naive    PM   CLS/PM  Naive/PM
 50 0.1  -0.010  -0.136  -0.001   0.452  0.474  0.457   0.96  0.95  0.97     0.99      1.04
    0.3  -0.010  -0.380   0.012   0.452  0.603  0.467   0.96  0.89  0.97     0.97      1.29
    0.5  -0.010  -0.650   0.032   0.452  0.825  0.482   0.96  0.75  0.97     0.94      1.71
100 0.1   0.001  -0.103   0.006   0.316  0.334  0.317   0.97  0.96  0.97     1.00      1.05
    0.3   0.001  -0.339   0.017   0.316  0.469  0.324   0.97  0.83  0.96     0.98      1.45
    0.5   0.001  -0.607   0.027   0.316  0.701  0.332   0.97  0.59  0.97     0.95      2.11
500 0.1  -0.003  -0.109   0.000   0.140  0.178  0.140   0.95  0.88  0.95     1.00      1.27
    0.3  -0.003  -0.353   0.008   0.140  0.381  0.143   0.95  0.32  0.94     0.98      2.66
    0.5  -0.003  -0.626   0.020   0.140  0.644  0.148   0.95  0.02  0.94     0.95      4.36

Table B.97: Empirical variance and asymptotic variance of slope (β̂1) when ρ = 0.8 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

         Empirical Variance    Asymptotic Variance
  n   p    CLS  Naive     PM     CLS  Naive     PM
 50 0.1  0.002  0.003  0.002   0.002  0.003  0.002
    0.3  0.002  0.004  0.003   0.002  0.004  0.002
    0.5  0.002  0.006  0.003   0.002  0.005  0.002
100 0.1  0.001  0.001  0.001   0.001  0.001  0.001
    0.3  0.001  0.002  0.001   0.001  0.002  0.001
    0.5  0.001  0.002  0.002   0.001  0.002  0.001
500 0.1  0.000  0.000  0.000   0.000  0.000  0.000
    0.3  0.000  0.000  0.000   0.000  0.000  0.000
    0.5  0.000  0.000  0.000   0.000  0.000  0.000

Table B.98: Bias, RMSE, 95% CP, and relative RMSE of slope (β̂1) when ρ = 0.8 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5

                  Bias                   RMSE                  CP               RRMSE
  n   p     CLS   Naive      PM     CLS  Naive     PM    CLS Naive    PM   CLS/PM  Naive/PM
 50 0.1   0.001  -0.022  -0.000   0.044  0.056  0.045   0.94  0.91  0.92     0.98      1.26
    0.3   0.001  -0.045  -0.006   0.044  0.075  0.047   0.94  0.84  0.90     0.92      1.59
    0.5   0.001  -0.062  -0.017   0.044  0.093  0.052   0.94  0.82  0.89     0.84      1.78
100 0.1   0.000  -0.021  -0.002   0.030  0.041  0.032   0.93  0.88  0.93     0.95      1.28
    0.3   0.000  -0.042  -0.007   0.030  0.059  0.032   0.93  0.79  0.91     0.93      1.82
    0.5   0.000  -0.060  -0.016   0.030  0.076  0.037   0.93  0.72  0.88     0.81      2.07
500 0.1  -0.000  -0.021  -0.002   0.014  0.025  0.014   0.96  0.74  0.94     0.99      1.77
    0.3  -0.000  -0.044  -0.008   0.014  0.047  0.016   0.96  0.32  0.91     0.88      2.94
    0.5  -0.000  -0.061  -0.016   0.014  0.065  0.022   0.96  0.13  0.77     0.65      2.98

Table B.99: Empirical variance and asymptotic variance of mean of x (µ̂x) when ρ = 0.8 over
different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

         Empirical Variance    Asymptotic Variance
  n   p    CLS  Naive     PM     CLS  Naive     PM
 50 0.1  1.950  2.087  1.974   1.985  1.760  1.935
    0.3  1.950  2.169  2.040   1.985  1.648  1.915
    0.5  1.950  2.223  2.058   1.985  1.623  1.879
100 0.1  0.924  0.952  0.930   0.996  0.891  0.970
    0.3  0.924  1.000  0.956   0.996  0.823  0.955
    0.5  0.924  1.047  0.976   0.996  0.807  0.936
500 0.1  0.209  0.216  0.209   0.200  0.178  0.197
    0.3  0.209  0.235  0.214   0.200  0.165  0.194
    0.5  0.209  0.250  0.219   0.200  0.161  0.190

Table B.100: Bias, RMSE, 95% CP, and relative RMSE of mean of x (µ̂x) when ρ = 0.8 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5

                  Bias                   RMSE                  CP               RRMSE
  n   p     CLS   Naive      PM     CLS  Naive     PM    CLS Naive    PM   CLS/PM  Naive/PM
 50 0.1   0.095  -1.512   0.086   1.412  2.011  1.394   0.95  0.77  0.94     1.01      1.44
    0.3   0.095  -3.392   0.000   1.412  3.627  1.384   0.95  0.26  0.94     1.02      2.62
    0.5   0.095  -5.019  -0.060   1.412  5.178  1.372   0.95  0.04  0.93     1.03      3.77
100 0.1   0.066  -1.346   0.029   1.000  1.644  0.985   0.96  0.69  0.96     1.02      1.67
    0.3   0.066  -3.285  -0.078   1.000  3.408  0.980   0.96  0.06  0.95     1.02      3.48
    0.5   0.066  -4.923  -0.192   1.000  5.005  0.987   0.96  0.00  0.93     1.01      5.07
500 0.1   0.026  -1.391  -0.008   0.448  1.453  0.443   0.93  0.10  0.94     1.01      3.28
    0.3   0.026  -3.340  -0.105   0.448  3.364  0.453   0.93  0.00  0.93     0.99      7.43
    0.5   0.026  -5.002  -0.233   0.448  5.018  0.494   0.93  0.00  0.88     0.91     10.17

Table B.101: Empirical variance and asymptotic variance of mean of y (µ̂y) when ρ = 0.8 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

         Empirical Variance    Asymptotic Variance
  n   p    CLS  Naive     PM     CLS  Naive     PM
 50 0.1  0.463  0.483  0.467   0.501  0.447  0.491
    0.3  0.463  0.521  0.483   0.501  0.419  0.486
    0.5  0.463  0.569  0.497   0.501  0.416  0.475
100 0.1  0.213  0.218  0.213   0.250  0.224  0.244
    0.3  0.213  0.227  0.226   0.250  0.208  0.241
    0.5  0.213  0.245  0.239   0.250  0.205  0.235
500 0.1  0.052  0.053  0.052   0.050  0.045  0.049
    0.3  0.052  0.058  0.054   0.050  0.041  0.049
    0.5  0.052  0.061  0.055   0.050  0.040  0.047

Table B.102: Bias, RMSE, 95% CP, and relative RMSE of mean of y (µ̂y) when ρ = 0.8 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5

                  Bias                   RMSE                  CP               RRMSE
  n   p     CLS   Naive      PM     CLS  Naive     PM    CLS Naive    PM   CLS/PM  Naive/PM
 50 0.1   0.028  -0.775   0.028   0.709  1.024  0.701   0.95  0.79  0.94     1.01      1.46
    0.3   0.028  -1.722  -0.013   0.709  1.840  0.698   0.95  0.26  0.94     1.02      2.64
    0.5   0.028  -2.535  -0.046   0.709  2.615  0.691   0.95  0.04  0.93     1.03      3.79
100 0.1   0.027  -0.677   0.011   0.500  0.826  0.494   0.96  0.70  0.96     1.01      1.67
    0.3   0.027  -1.643  -0.037   0.500  1.705  0.492   0.96  0.06  0.96     1.02      3.47
    0.5   0.027  -2.464  -0.098   0.500  2.505  0.495   0.96  0.00  0.94     1.01      5.06
500 0.1   0.007  -0.699  -0.009   0.224  0.731  0.222   0.94  0.10  0.94     1.01      3.29
    0.3   0.007  -1.675  -0.056   0.224  1.687  0.227   0.94  0.00  0.93     0.98      7.42
    0.5   0.007  -2.504  -0.119   0.224  2.512  0.248   0.94  0.00  0.89     0.90     10.13

Table B.103: Empirical variance and asymptotic variance of standard deviation of x (σ̂x) when ρ = 0.8 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

         Empirical Variance    Asymptotic Variance
  n   p    CLS  Naive     PM     CLS  Naive     PM
 50 0.1  1.082  0.940  1.102   1.040  0.924  0.967
    0.3  1.082  0.901  1.241   1.040  0.870  0.957
    0.5  1.082  0.943  1.396   1.040  0.864  0.940
100 0.1  0.533  0.478  0.549   0.510  0.456  0.485
    0.3  0.533  0.463  0.606   0.510  0.423  0.478
    0.5  0.533  0.481  0.683   0.510  0.416  0.468
500 0.1  0.105  0.099  0.112   0.100  0.090  0.098
    0.3  0.105  0.099  0.129   0.100  0.083  0.097
    0.5  0.105  0.102  0.143   0.100  0.081  0.095

Table B.104: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of x (σ̂x) when ρ = 0.8 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5

                  Bias                   RMSE                  CP               RRMSE
  n   p     CLS   Naive      PM     CLS  Naive     PM    CLS Naive    PM   CLS/PM  Naive/PM
 50 0.1  -0.092  -1.323  -0.220   1.024  1.635  1.008   0.93  0.66  0.90     1.02      1.62
    0.3  -0.092  -2.252  -0.377   1.024  2.438  1.048   0.93  0.37  0.85     0.98      2.33
    0.5  -0.092  -2.884  -0.522   1.024  3.030  1.101   0.93  0.19  0.82     0.93      2.75
100 0.1  -0.046  -1.173  -0.181   0.715  1.354  0.719   0.94  0.57  0.91     0.99      1.88
    0.3  -0.046  -2.161  -0.356   0.715  2.256  0.777   0.94  0.14  0.86     0.92      2.90
    0.5  -0.046  -2.796  -0.503   0.715  2.869  0.849   0.94  0.03  0.80     0.84      3.38
500 0.1  -0.008  -1.141  -0.095   0.317  1.180  0.327   0.94  0.06  0.93     0.97      3.60
    0.3  -0.008  -2.126  -0.253   0.317  2.145  0.401   0.94  0.00  0.82     0.79      5.35
    0.5  -0.008  -2.765  -0.416   0.317  2.780  0.517   0.94  0.00  0.67     0.61      5.37

Table B.105: Empirical variance and asymptotic variance of standard deviation of y (σ̂y) when ρ = 0.8 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

         Empirical Variance    Asymptotic Variance
  n   p    CLS  Naive     PM     CLS  Naive     PM
 50 0.1  0.274  0.244  0.280   0.263  0.235  0.245
    0.3  0.274  0.236  0.324   0.263  0.221  0.243
    0.5  0.274  0.248  0.350   0.263  0.221  0.238
100 0.1  0.127  0.111  0.126   0.128  0.115  0.122
    0.3  0.127  0.107  0.145   0.128  0.107  0.120
    0.5  0.127  0.113  0.158   0.128  0.106  0.118
500 0.1  0.025  0.023  0.026   0.025  0.022  0.025
    0.3  0.025  0.023  0.030   0.025  0.021  0.024
    0.5  0.025  0.023  0.036   0.025  0.020  0.024

Table B.106: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of y (σ̂y) when ρ = 0.8 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5

                  Bias                   RMSE                  CP               RRMSE
  n   p     CLS   Naive      PM     CLS  Naive     PM    CLS Naive    PM   CLS/PM  Naive/PM
 50 0.1  -0.021  -0.628  -0.075   0.513  0.793  0.501   0.93  0.71  0.90     1.02      1.58
    0.3  -0.021  -1.094  -0.151   0.513  1.191  0.516   0.93  0.39  0.85     0.99      2.31
    0.5  -0.021  -1.399  -0.234   0.513  1.476  0.541   0.93  0.22  0.82     0.95      2.73
100 0.1  -0.016  -0.571  -0.078   0.358  0.664  0.358   0.94  0.58  0.92     1.00      1.86
    0.3  -0.016  -1.056  -0.156   0.358  1.105  0.380   0.94  0.14  0.87     0.94      2.91
    0.5  -0.016  -1.367  -0.237   0.358  1.405  0.417   0.94  0.04  0.82     0.86      3.37
500 0.1  -0.006  -0.570  -0.048   0.159  0.590  0.164   0.96  0.04  0.94     0.97      3.60
    0.3  -0.006  -1.061  -0.125   0.159  1.071  0.200   0.96  0.00  0.84     0.79      5.36
    0.5  -0.006  -1.379  -0.205   0.159  1.386  0.256   0.96  0.00  0.70     0.62      5.41

Table B.107: Empirical variance
and asymptotic variance of correlation (ρ̂) when ρ = 0.8 over different right censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n    p    Empirical Variance   Asymptotic Variance
            CLS  Naive     PM    CLS  Naive     PM
50   0.1  0.003  0.005  0.003  0.003  0.005  0.003
     0.3  0.003  0.008  0.003  0.003  0.007  0.003
     0.5  0.003  0.012  0.004  0.003  0.011  0.004
100  0.1  0.001  0.002  0.002  0.001  0.002  0.001
     0.3  0.001  0.004  0.002  0.001  0.003  0.002
     0.5  0.001  0.005  0.002  0.001  0.005  0.002
500  0.1  0.000  0.000  0.000  0.000  0.000  0.000
     0.3  0.000  0.001  0.000  0.000  0.001  0.000
     0.5  0.000  0.001  0.000  0.000  0.001  0.000

Table B.108: Bias, RMSE, 95% CP, and relative RMSE of correlation (ρ̂) when ρ = 0.8 over sample sizes 50, 100, and 500 and right censoring proportions of 0.1, 0.3, and 0.5

n    p           Bias                 RMSE                  CP               RRMSE
            CLS  Naive     PM    CLS  Naive     PM    CLS  Naive     PM CLS/PM Naive/PM
50   0.1 -0.005 -0.054 -0.010  0.055  0.087  0.055   0.94   0.95   0.93   1.00     1.59
     0.3 -0.005 -0.102 -0.024  0.055  0.133  0.061   0.94   0.90   0.96   0.90     2.17
     0.5 -0.005 -0.139 -0.045  0.055  0.174  0.076   0.94   0.86   0.96   0.72     2.28
100  0.1 -0.003 -0.047 -0.007  0.038  0.065  0.038   0.94   0.90   0.95   0.98     1.72
     0.3 -0.003 -0.092 -0.020  0.038  0.108  0.045   0.94   0.71   0.95   0.84     2.43
     0.5 -0.003 -0.129 -0.038  0.038  0.147  0.057   0.94   0.57   0.93   0.66     2.59
500  0.1 -0.000 -0.042 -0.004  0.017  0.047  0.018   0.94   0.45   0.95   0.97     2.63
     0.3 -0.000 -0.088 -0.016  0.017  0.092  0.023   0.94   0.02   0.88   0.74     3.91
     0.5 -0.000 -0.124 -0.034  0.017  0.128  0.038   0.94   0.00   0.57   0.45     3.35

B.3 Results in the Presence of Left Censored Observations in the Data

Figure B.7: Boxplots for estimators of mean of x (µ̂x) and mean of y (µ̂y), when n = 100, ρ = 0.2 over different left censoring proportions of 0.1, 0.3, and 0.5

Figure B.8: Boxplots for estimators of standard deviation of x (σ̂x) and standard deviation of y (σ̂y), when n = 100, ρ = 0.2 over different left censoring proportions of 0.1, 0.3, and 0.5

Figure B.9: Boxplots for estimators of correlation between x and y (ρ̂), when n = 100, ρ = 0.2 over
different left censoring proportions of 0.1, 0.3, and 0.5

Table B.109: Empirical variance and asymptotic variance of intercept (β̂0) when ρ = 0.2 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n    p    Empirical Variance   Asymptotic Variance
            CLS  Naive     PM    CLS  Naive     PM
50   0.1  0.460  0.525  0.457  0.545  0.563  0.552
     0.3  0.460  0.807  0.483  0.545  0.760  0.561
     0.5  0.460  1.204  0.530  0.545  1.113  0.567
100  0.1  0.215  0.252  0.221  0.267  0.273  0.267
     0.3  0.215  0.366  0.232  0.267  0.356  0.271
     0.5  0.215  0.536  0.252  0.267  0.509  0.274
500  0.1  0.052  0.059  0.053  0.052  0.053  0.053
     0.3  0.052  0.082  0.055  0.052  0.070  0.053
     0.5  0.052  0.109  0.060  0.052  0.099  0.054

Table B.110: Empirical variance and asymptotic variance of slope (β̂1) when ρ = 0.2 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n    p    Empirical Variance   Asymptotic Variance
            CLS  Naive     PM    CLS  Naive     PM
50   0.1  0.006  0.006  0.006  0.005  0.005  0.005
     0.3  0.006  0.008  0.005  0.005  0.008  0.005
     0.5  0.006  0.011  0.006  0.005  0.010  0.006
100  0.1  0.003  0.003  0.003  0.003  0.003  0.003
     0.3  0.003  0.004  0.003  0.003  0.004  0.003
     0.5  0.003  0.004  0.003  0.003  0.004  0.003
500  0.1  0.001  0.001  0.001  0.001  0.001  0.001
     0.3  0.001  0.001  0.001  0.001  0.001  0.001
     0.5  0.001  0.001  0.001  0.001  0.001  0.001

Table B.111: Empirical variance and asymptotic variance of mean of x (µ̂x) when ρ = 0.2 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n    p    Empirical Variance   Asymptotic Variance
            CLS  Naive     PM    CLS  Naive     PM
50   0.1  1.955  1.989  1.972  1.984  1.909  1.978
     0.3  1.955  2.375  2.046  1.984  1.845  2.017
     0.5  1.955  2.686  2.172  1.984  1.999  2.024
100  0.1  0.958  0.990  0.971  0.997  0.905  0.990
     0.3  0.958  1.172  1.016  0.997  0.913  1.003
     0.5  0.958  1.355  1.092  0.997  0.986  1.004
500  0.1  0.206  0.207  0.207  0.200  0.180  0.200
     0.3  0.206  0.235  0.216  0.200  0.181  0.202
     0.5  0.206  0.275  0.225  0.200  0.195  0.202

Table B.112: Bias, RMSE, 95% CP, and relative RMSE of mean
of x (µ̂x) when ρ = 0.2 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

n    p           Bias                 RMSE                  CP               RRMSE
            CLS  Naive     PM    CLS  Naive     PM    CLS  Naive     PM CLS/PM Naive/PM
50   0.1  0.099  1.193  0.060  1.412  1.825  1.408   0.95   0.84   0.94   1.00     1.30
     0.3  0.099  3.225  0.052  1.412  3.500  1.421   0.95   0.37   0.94   0.99     2.46
     0.5  0.099  4.787  0.067  1.412  4.991  1.424   0.95   0.10   0.93   0.99     3.50
100  0.1  0.064  1.278  0.059  1.001  1.593  0.997   0.96   0.72   0.95   1.00     1.60
     0.3  0.064  3.072  0.079  1.001  3.217  1.005   0.96   0.11   0.95   1.00     3.20
     0.5  0.064  4.625  0.113  1.001  4.730  1.008   0.96   0.00   0.93   0.99     4.69
500  0.1  0.027  1.264  0.027  0.448  1.333  0.448   0.94   0.17   0.95   1.00     2.98
     0.3  0.027  3.074  0.043  0.448  3.103  0.452   0.94   0.00   0.94   0.99     6.87
     0.5  0.027  4.647  0.084  0.448  4.668  0.457   0.94   0.00   0.93   0.98    10.21

Table B.113: Empirical variance and asymptotic variance of mean of y (µ̂y) when ρ = 0.2 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n    p    Empirical Variance   Asymptotic Variance
            CLS  Naive     PM    CLS  Naive     PM
50   0.1  0.444  0.471  0.441  0.505  0.432  0.503
     0.3  0.444  0.525  0.456  0.505  0.461  0.507
     0.5  0.444  0.616  0.483  0.505  0.497  0.507
100  0.1  0.216  0.238  0.221  0.251  0.226  0.248
     0.3  0.216  0.272  0.229  0.251  0.227  0.250
     0.5  0.216  0.329  0.246  0.251  0.243  0.250
500  0.1  0.051  0.054  0.051  0.050  0.045  0.050
     0.3  0.051  0.059  0.053  0.050  0.045  0.050
     0.5  0.051  0.068  0.058  0.050  0.048  0.050

Table B.114: Bias, RMSE, 95% CP, and relative RMSE of mean of y (µ̂y) when ρ = 0.2 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

n    p           Bias                 RMSE                  CP               RRMSE
            CLS  Naive     PM    CLS  Naive     PM    CLS  Naive     PM CLS/PM Naive/PM
50   0.1  0.005  0.850 -0.014  0.711  1.074  0.709   0.95   0.73   0.96   1.00     1.51
     0.3  0.005  1.598 -0.001  0.711  1.736  0.712   0.95   0.35   0.96   1.00     2.44
     0.5  0.005  2.376  0.015  0.711  2.479  0.712   0.95   0.10   0.95   1.00     3.48
100  0.1  0.015  0.621  0.017  0.501  0.782  0.498   0.97   0.74   0.96   1.01     1.57
     0.3  0.015  1.523  0.035  0.501  1.595  0.501   0.97   0.12   0.97   1.00     3.18
     0.5  0.015  2.320  0.053  0.501  2.371  0.503   0.97   0.01   0.95   1.00     4.71
500  0.1  0.001  0.617  0.002  0.223  0.652  0.223   0.93   0.17   0.93   1.00     2.92
     0.3  0.001  1.524  0.014  0.223  1.539  0.225   0.93   0.00   0.93   0.99     6.84
     0.5  0.001  2.315  0.038  0.223  2.325  0.227   0.93   0.00   0.91   0.98    10.24

Table B.115: Empirical variance and asymptotic variance of standard deviation of x (σ̂x) when ρ = 0.2 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n    p    Empirical Variance   Asymptotic Variance
            CLS  Naive     PM    CLS  Naive     PM
50   0.1  1.104  1.071  1.117  1.023  0.989  0.989
     0.3  1.104  1.108  1.307  1.023  0.964  1.008
     0.5  1.104  1.375  1.549  1.023  1.057  1.012
100  0.1  0.551  0.509  0.586  0.506  0.460  0.495
     0.3  0.551  0.539  0.648  0.506  0.466  0.502
     0.5  0.551  0.607  0.768  0.506  0.506  0.502
500  0.1  0.102  0.099  0.111  0.100  0.090  0.100
     0.3  0.102  0.102  0.129  0.100  0.091  0.101
     0.5  0.102  0.115  0.149  0.100  0.098  0.101

Table B.116: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of x (σ̂x) when ρ = 0.2 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

n    p           Bias                 RMSE                  CP               RRMSE
            CLS  Naive     PM    CLS  Naive     PM    CLS  Naive     PM CLS/PM Naive/PM
50   0.1 -0.096 -0.865 -0.112  1.016  1.318  1.001   0.94   0.80   0.93   1.02     1.32
     0.3 -0.096 -2.022 -0.123  1.016  2.247  1.012   0.94   0.44   0.91   1.00     2.22
     0.5 -0.096 -2.677 -0.166  1.016  2.868  1.020   0.94   0.31   0.88   1.00     2.81
100  0.1 -0.043 -0.976 -0.082  0.713  1.188  0.708   0.94   0.66   0.92   1.01     1.68
     0.3 -0.043 -1.932 -0.116  0.713  2.049  0.718   0.94   0.23   0.91   0.99     2.86
     0.5 -0.043 -2.581 -0.166  0.713  2.677  0.728   0.94   0.09   0.88   0.98     3.68
500  0.1 -0.008 -0.991 -0.018  0.317  1.035  0.316   0.95   0.11   0.93   1.00     3.27
     0.3 -0.008 -1.929 -0.046  0.317  1.952  0.321   0.95   0.00   0.91   0.99     6.07
     0.5 -0.008 -2.569 -0.105  0.317  2.588  0.335   0.95   0.00   0.88   0.95     7.73

Table B.117: Empirical variance and asymptotic variance of standard deviation of y (σ̂y) when ρ = 0.2 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500
n    p    Empirical Variance   Asymptotic Variance
            CLS  Naive     PM    CLS  Naive     PM
50   0.1  0.257  0.258  0.302  0.261  0.224  0.251
     0.3  0.257  0.272  0.333  0.261  0.241  0.254
     0.5  0.257  0.315  0.394  0.261  0.263  0.254
100  0.1  0.126  0.125  0.139  0.127  0.115  0.124
     0.3  0.126  0.138  0.156  0.127  0.116  0.125
     0.5  0.126  0.164  0.193  0.127  0.125  0.125
500  0.1  0.025  0.024  0.027  0.025  0.023  0.025
     0.3  0.025  0.026  0.031  0.025  0.023  0.025
     0.5  0.025  0.030  0.038  0.025  0.024  0.025

Table B.118: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of y (σ̂y) when ρ = 0.2 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

n    p           Bias                 RMSE                  CP               RRMSE
            CLS  Naive     PM    CLS  Naive     PM    CLS  Naive     PM CLS/PM Naive/PM
50   0.1  0.000 -0.656 -0.016  0.511  0.809  0.502   0.94   0.65   0.93   1.02     1.61
     0.3  0.000 -1.012 -0.047  0.511  1.124  0.506   0.94   0.44   0.89   1.01     2.22
     0.5  0.000 -1.346 -0.079  0.511  1.440  0.510   0.94   0.29   0.86   1.00     2.83
100  0.1 -0.005 -0.491 -0.033  0.357  0.597  0.354   0.94   0.65   0.93   1.01     1.69
     0.3 -0.005 -0.981 -0.065  0.357  1.039  0.359   0.94   0.24   0.91   0.99     2.89
     0.5 -0.005 -1.319 -0.091  0.357  1.366  0.365   0.94   0.09   0.86   0.98     3.74
500  0.1 -0.008 -0.501 -0.014  0.158  0.523  0.158   0.95   0.10   0.95   1.00     3.30
     0.3 -0.008 -0.976 -0.035  0.158  0.987  0.163   0.95   0.00   0.91   0.97     6.07
     0.5 -0.008 -1.300 -0.071  0.158  1.309  0.174   0.95   0.00   0.85   0.91     7.54

Table B.119: Empirical variance and asymptotic variance of correlation (ρ̂) when ρ = 0.2 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n    p    Empirical Variance   Asymptotic Variance
            CLS  Naive     PM    CLS  Naive     PM
50   0.1  0.020  0.023  0.020  0.019  0.022  0.018
     0.3  0.020  0.031  0.019  0.019  0.027  0.018
     0.5  0.020  0.039  0.020  0.019  0.034  0.018
100  0.1  0.010  0.011  0.010  0.009  0.011  0.009
     0.3  0.010  0.013  0.010  0.009  0.013  0.009
     0.5  0.010  0.017  0.010  0.009  0.017  0.009
500  0.1  0.002  0.002  0.002  0.002  0.002  0.002
     0.3  0.002  0.003  0.002  0.002  0.003  0.002
     0.5  0.002  0.003  0.002  0.002  0.003  0.002

Table B.120: Bias, RMSE, 95% CP, and relative
RMSE of correlation (ρ̂) when ρ = 0.2 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

n    p           Bias                 RMSE                  CP               RRMSE
            CLS  Naive     PM    CLS  Naive     PM    CLS  Naive     PM CLS/PM Naive/PM
50   0.1 -0.005 -0.043 -0.009  0.137  0.153  0.134   0.93   0.92   0.91   1.02     1.14
     0.3 -0.005 -0.064 -0.013  0.137  0.176  0.135   0.93   0.90   0.91   1.01     1.31
     0.5 -0.005 -0.080 -0.017  0.137  0.202  0.135   0.93   0.90   0.92   1.01     1.49
100  0.1 -0.004 -0.039 -0.006  0.097  0.110  0.096   0.93   0.93   0.93   1.01     1.15
     0.3 -0.004 -0.063 -0.011  0.097  0.131  0.096   0.93   0.91   0.93   1.00     1.37
     0.5 -0.004 -0.077 -0.017  0.097  0.151  0.097   0.93   0.91   0.92   0.99     1.55
500  0.1  0.000 -0.033 -0.001  0.042  0.056  0.042   0.95   0.90   0.95   1.00     1.33
     0.3  0.000 -0.064 -0.005  0.042  0.082  0.043   0.95   0.77   0.95   0.99     1.92
     0.5  0.000 -0.083 -0.011  0.042  0.101  0.045   0.95   0.69   0.94   0.95     2.25

Table B.121: Empirical variance and asymptotic variance of intercept (β̂0) when ρ = 0.4 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n    p    Empirical Variance   Asymptotic Variance
            CLS  Naive     PM    CLS  Naive     PM
50   0.1  0.400  0.486  0.402  0.476  0.524  0.482
     0.3  0.400  0.733  0.434  0.476  0.709  0.489
     0.5  0.400  1.202  0.504  0.476  1.043  0.498
100  0.1  0.189  0.221  0.197  0.234  0.248  0.233
     0.3  0.189  0.328  0.210  0.234  0.334  0.237
     0.5  0.189  0.499  0.237  0.234  0.482  0.239
500  0.1  0.045  0.051  0.046  0.046  0.049  0.046
     0.3  0.045  0.074  0.049  0.046  0.065  0.047
     0.5  0.045  0.097  0.056  0.046  0.094  0.047

Table B.122: Bias, RMSE, 95% CP, and relative RMSE of intercept (β̂0) when ρ = 0.4 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

n    p           Bias                 RMSE                  CP               RRMSE
            CLS  Naive     PM    CLS  Naive     PM    CLS  Naive     PM CLS/PM Naive/PM
50   0.1 -0.003  0.611 -0.001  0.690  0.947  0.694   0.96   0.88   0.97   0.99     1.37
     0.3 -0.003  1.373  0.031  0.690  1.610  0.700   0.96   0.62   0.97   0.99     2.30
     0.5 -0.003  2.073  0.054  0.690  2.311  0.708   0.96   0.47   0.94   0.98     3.27
100  0.1  0.010  0.544  0.019  0.483  0.738  0.484   0.97   0.81   0.97   1.00     1.53
     0.3  0.010  1.318  0.053  0.483  1.439  0.490   0.97   0.35   0.96   0.99     2.94
     0.5  0.010  2.027  0.093  0.483  2.143  0.498   0.97   0.17   0.93   0.97     4.30
500  0.1 -0.003  0.526  0.003  0.214  0.570  0.214   0.95   0.32   0.94   1.00     2.66
     0.3 -0.003  1.316  0.032  0.214  1.340  0.218   0.95   0.00   0.94   0.98     6.14
     0.5 -0.003  2.036  0.077  0.214  2.059  0.230   0.95   0.00   0.91   0.93     8.97

Table B.123: Empirical variance and asymptotic variance of slope (β̂1) when ρ = 0.4 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n    p    Empirical Variance   Asymptotic Variance
            CLS  Naive     PM    CLS  Naive     PM
50   0.1  0.005  0.006  0.005  0.005  0.005  0.005
     0.3  0.005  0.008  0.005  0.005  0.007  0.005
     0.5  0.005  0.011  0.005  0.005  0.009  0.005
100  0.1  0.002  0.003  0.002  0.002  0.003  0.002
     0.3  0.002  0.003  0.003  0.002  0.003  0.002
     0.5  0.002  0.004  0.003  0.002  0.004  0.002
500  0.1  0.000  0.001  0.000  0.000  0.001  0.000
     0.3  0.000  0.001  0.000  0.000  0.001  0.000
     0.5  0.000  0.001  0.001  0.000  0.001  0.001

Table B.124: Bias, RMSE, 95% CP, and relative RMSE of slope (β̂1) when ρ = 0.4 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

n    p           Bias                 RMSE                  CP               RRMSE
            CLS  Naive     PM    CLS  Naive     PM    CLS  Naive     PM CLS/PM Naive/PM
50   0.1  0.001 -0.032 -0.002  0.068  0.080  0.068   0.93   0.91   0.93   1.00     1.18
     0.3  0.001 -0.054 -0.007  0.068  0.098  0.069   0.93   0.86   0.94   0.98     1.43
     0.5  0.001 -0.069 -0.011  0.068  0.117  0.072   0.93   0.84   0.93   0.95     1.63
100  0.1 -0.001 -0.031 -0.002  0.047  0.059  0.047   0.94   0.88   0.93   1.00     1.25
     0.3 -0.001 -0.054 -0.007  0.047  0.078  0.049   0.94   0.83   0.92   0.97     1.61
     0.5 -0.001 -0.070 -0.012  0.047  0.094  0.051   0.94   0.77   0.92   0.93     1.87
500  0.1  0.000 -0.028 -0.001  0.020  0.036  0.020   0.96   0.77   0.96   1.00     1.80
     0.3  0.000 -0.054 -0.005  0.020  0.059  0.021   0.96   0.43   0.95   0.97     2.85
     0.5  0.000 -0.071 -0.011  0.020  0.077  0.025   0.96   0.27   0.90   0.81     3.10

Table B.125: Empirical variance and asymptotic variance of mean of x (µ̂x) when ρ = 0.4 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n    p    Empirical Variance   Asymptotic Variance
            CLS  Naive     PM    CLS  Naive     PM
50   0.1  1.955  2.026  1.984  1.984  1.777  1.972
     0.3  1.955  2.319  2.039  1.984  1.807  1.991
     0.5  1.955  2.647  2.153  1.984  1.935  1.975
100  0.1  0.948  0.981  0.958  0.997  0.894  0.986
     0.3  0.948  1.125  1.007  0.997  0.893  0.991
     0.5  0.948  1.290  1.072  0.997  0.953  0.981
500  0.1  0.207  0.207  0.209  0.200  0.178  0.199
     0.3  0.207  0.229  0.215  0.200  0.177  0.200
     0.5  0.207  0.264  0.226  0.200  0.189  0.197

Table B.126: Bias, RMSE, 95% CP, and relative RMSE of mean of x (µ̂x) when ρ = 0.4 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

n    p           Bias                 RMSE                  CP               RRMSE
            CLS  Naive     PM    CLS  Naive     PM    CLS  Naive     PM CLS/PM Naive/PM
50   0.1  0.099  1.626  0.063  1.412  2.102  1.406   0.94   0.75   0.94   1.00     1.50
     0.3  0.099  3.455  0.084  1.412  3.708  1.414   0.94   0.30   0.94   1.00     2.62
     0.5  0.099  5.068  0.143  1.412  5.256  1.413   0.94   0.06   0.93   1.00     3.72
100  0.1  0.065  1.404  0.067  1.000  1.693  0.995   0.96   0.69   0.95   1.01     1.70
     0.3  0.065  3.303  0.111  1.000  3.436  1.002   0.96   0.08   0.94   1.00     3.43
     0.5  0.065  4.929  0.185  1.000  5.025  1.007   0.96   0.00   0.91   0.99     4.99
500  0.1  0.027  1.387  0.034  0.448  1.449  0.447   0.94   0.12   0.95   1.00     3.24
     0.3  0.027  3.293  0.077  0.448  3.320  0.454   0.94   0.00   0.94   0.99     7.32
     0.5  0.027  4.931  0.160  0.448  4.950  0.472   0.94   0.00   0.92   0.95    10.49

Table B.127: Empirical variance and asymptotic variance of mean of y (µ̂y) when ρ = 0.4 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n    p    Empirical Variance   Asymptotic Variance
            CLS  Naive     PM    CLS  Naive     PM
50   0.1  1.955  2.026  1.984  1.984  1.777  1.972
     0.3  1.955  2.319  2.039  1.984  1.807  1.991
     0.5  1.955  2.647  2.153  1.984  1.935  1.975
100  0.1  0.948  0.981  0.958  0.997  0.894  0.986
     0.3  0.948  1.125  1.007  0.997  0.893  0.991
     0.5  0.948  1.290  1.072  0.997  0.953  0.981
500  0.1  0.207  0.207  0.209  0.200  0.178  0.199
     0.3  0.207  0.229  0.215  0.200  0.177  0.200
     0.5  0.207  0.264  0.226  0.200  0.189  0.197

Table B.128: Bias, RMSE, 95% CP, and relative RMSE of mean of y (µ̂y) when ρ = 0.4 over sample
sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

n    p           Bias                 RMSE                  CP               RRMSE
            CLS  Naive     PM    CLS  Naive     PM    CLS  Naive     PM CLS/PM Naive/PM
50   0.1  0.012  0.784  0.001  0.710  1.031  0.706   0.95   0.80   0.96   1.01     1.46
     0.3  0.012  1.709  0.025  0.710  1.836  0.707   0.95   0.29   0.95   1.00     2.60
     0.5  0.012  2.519  0.052  0.710  2.612  0.707   0.95   0.05   0.94   1.01     3.70
100  0.1  0.019  0.691  0.024  0.501  0.837  0.498   0.97   0.68   0.96   1.01     1.68
     0.3  0.019  1.641  0.053  0.501  1.707  0.500   0.97   0.08   0.97   1.00     3.41
     0.5  0.019  2.454  0.094  0.501  2.502  0.503   0.97   0.00   0.95   1.00     4.98
500  0.1  0.003  0.680  0.007  0.223  0.712  0.223   0.93   0.11   0.93   1.00     3.19
     0.3  0.003  1.638  0.032  0.223  1.651  0.225   0.93   0.00   0.93   0.99     7.33
     0.5  0.003  2.460  0.076  0.223  2.469  0.234   0.93   0.00   0.90   0.95    10.55

Table B.129: Empirical variance and asymptotic variance of standard deviation of x (σ̂x) when ρ = 0.4 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n    p    Empirical Variance   Asymptotic Variance
            CLS  Naive     PM    CLS  Naive     PM
50   0.1  1.096  1.036  1.138  1.025  0.921  0.986
     0.3  1.096  1.083  1.265  1.025  0.944  0.996
     0.5  1.096  1.268  1.482  1.025  1.022  0.988
100  0.1  0.546  0.498  0.581  0.507  0.455  0.493
     0.3  0.546  0.512  0.629  0.507  0.456  0.496
     0.5  0.546  0.569  0.743  0.507  0.489  0.490
500  0.1  0.103  0.096  0.111  0.100  0.089  0.099
     0.3  0.103  0.098  0.128  0.100  0.089  0.100
     0.5  0.103  0.104  0.140  0.100  0.095  0.099

Table B.130: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of x (σ̂x) when ρ = 0.4 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

n    p           Bias                 RMSE                  CP               RRMSE
            CLS  Naive     PM    CLS  Naive     PM    CLS  Naive     PM CLS/PM Naive/PM
50   0.1 -0.096 -1.162 -0.128  1.017  1.507  1.001   0.94   0.70   0.92   1.02     1.51
     0.3 -0.096 -2.016 -0.184  1.017  2.238  1.015   0.94   0.43   0.92   1.00     2.21
     0.5 -0.096 -2.636 -0.284  1.017  2.824  1.034   0.94   0.31   0.88   0.98     2.73
100  0.1 -0.044 -1.014 -0.100  0.713  1.218  0.709   0.93   0.66   0.92   1.01     1.72
     0.3 -0.044 -1.944 -0.175  0.713  2.058  0.725   0.93   0.21   0.90   0.98     2.84
     0.5 -0.044 -2.553 -0.281  0.713  2.647  0.754   0.93   0.08   0.86   0.95     3.51
500  0.1 -0.008 -1.022 -0.035  0.317  1.065  0.317   0.95   0.09   0.93   1.00     3.36
     0.3 -0.008 -1.940 -0.109  0.317  1.963  0.334   0.95   0.00   0.90   0.95     5.87
     0.5 -0.008 -2.552 -0.226  0.317  2.571  0.387   0.95   0.00   0.84   0.82     6.64

Table B.131: Empirical variance and asymptotic variance of standard deviation of y (σ̂y) when ρ = 0.4 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n    p    Empirical Variance   Asymptotic Variance
            CLS  Naive     PM    CLS  Naive     PM
50   0.1  0.264  0.262  0.299  0.261  0.233  0.249
     0.3  0.264  0.271  0.338  0.261  0.235  0.250
     0.5  0.264  0.315  0.395  0.261  0.254  0.248
100  0.1  0.127  0.125  0.138  0.127  0.113  0.124
     0.3  0.127  0.132  0.157  0.127  0.113  0.124
     0.5  0.127  0.157  0.184  0.127  0.121  0.122
500  0.1  0.024  0.022  0.025  0.025  0.022  0.025
     0.3  0.024  0.025  0.030  0.025  0.022  0.025
     0.5  0.024  0.028  0.036  0.025  0.024  0.025

Table B.132: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of y (σ̂y) when ρ = 0.4 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

n    p           Bias                 RMSE                  CP               RRMSE
            CLS  Naive     PM    CLS  Naive     PM    CLS  Naive     PM CLS/PM Naive/PM
50   0.1 -0.004 -0.557 -0.036  0.511  0.737  0.501   0.93   0.73   0.91   1.02     1.47
     0.3 -0.004 -1.013 -0.087  0.511  1.124  0.507   0.93   0.43   0.87   1.01     2.22
     0.5 -0.004 -1.326 -0.131  0.511  1.419  0.515   0.93   0.29   0.85   0.99     2.75
100  0.1 -0.008 -0.514 -0.044  0.357  0.615  0.354   0.94   0.63   0.92   1.01     1.74
     0.3 -0.008 -0.986 -0.095  0.357  1.042  0.364   0.94   0.22   0.89   0.98     2.86
     0.5 -0.008 -1.304 -0.153  0.357  1.349  0.381   0.94   0.08   0.85   0.94     3.54
500  0.1 -0.007 -0.516 -0.021  0.158  0.537  0.159   0.96   0.07   0.95   1.00     3.38
     0.3 -0.007 -0.982 -0.065  0.158  0.993  0.171   0.96   0.00   0.90   0.93     5.82
     0.5 -0.007 -1.290 -0.128  0.158  1.299  0.202   0.96   0.00   0.82   0.78     6.43

Table B.133: Empirical variance and asymptotic variance of correlation (ρ̂) when ρ = 0.4 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and
500

n    p    Empirical Variance   Asymptotic Variance
            CLS  Naive     PM    CLS  Naive     PM
50   0.1  0.015  0.018  0.015  0.015  0.018  0.014
     0.3  0.015  0.028  0.015  0.015  0.023  0.014
     0.5  0.015  0.039  0.016  0.015  0.030  0.015
100  0.1  0.007  0.010  0.008  0.007  0.009  0.007
     0.3  0.007  0.012  0.008  0.007  0.011  0.007
     0.5  0.007  0.016  0.008  0.007  0.015  0.007
500  0.1  0.002  0.002  0.002  0.001  0.002  0.001
     0.3  0.002  0.002  0.002  0.001  0.002  0.001
     0.5  0.002  0.003  0.002  0.001  0.003  0.002

Table B.134: Bias, RMSE, 95% CP, and relative RMSE of correlation (ρ̂) when ρ = 0.4 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

n    p           Bias                 RMSE                  CP               RRMSE
            CLS  Naive     PM    CLS  Naive     PM    CLS  Naive     PM CLS/PM Naive/PM
50   0.1 -0.007 -0.070 -0.011  0.121  0.152  0.119   0.93   0.93   0.93   1.02     1.27
     0.3 -0.007 -0.111 -0.019  0.121  0.189  0.121   0.93   0.88   0.93   1.00     1.56
     0.5 -0.007 -0.142 -0.029  0.121  0.223  0.124   0.93   0.86   0.93   0.98     1.80
100  0.1 -0.005 -0.062 -0.008  0.085  0.112  0.085   0.93   0.90   0.93   1.00     1.33
     0.3 -0.005 -0.109 -0.017  0.085  0.152  0.087   0.93   0.86   0.93   0.98     1.76
     0.5 -0.005 -0.140 -0.026  0.085  0.185  0.089   0.93   0.80   0.93   0.95     2.07
500  0.1  0.000 -0.056 -0.002  0.037  0.070  0.038   0.95   0.74   0.95   1.00     1.86
     0.3  0.000 -0.107 -0.010  0.037  0.117  0.039   0.95   0.40   0.94   0.96     3.01
     0.5  0.000 -0.142 -0.020  0.037  0.151  0.044   0.95   0.24   0.93   0.86     3.46

Table B.135: Empirical variance and asymptotic variance of intercept (β̂0) when ρ = 0.6 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n    p    Empirical Variance   Asymptotic Variance
            CLS  Naive     PM    CLS  Naive     PM
50   0.1  0.304  0.371  0.313  0.363  0.416  0.367
     0.3  0.304  0.593  0.350  0.363  0.588  0.375
     0.5  0.304  0.928  0.429  0.363  0.874  0.385
100  0.1  0.145  0.175  0.155  0.178  0.196  0.178
     0.3  0.145  0.258  0.168  0.178  0.275  0.182
     0.5  0.145  0.371  0.202  0.178  0.402  0.184
500  0.1  0.034  0.040  0.035  0.035  0.039  0.035
     0.3  0.034  0.058  0.040  0.035  0.054  0.036
     0.5  0.034  0.080  0.045  0.035  0.079  0.036

Table B.136: Bias, RMSE, 95% CP, and relative RMSE of
intercept (β̂0) when ρ = 0.6 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

n    p           Bias                 RMSE                  CP               RRMSE
            CLS  Naive     PM    CLS  Naive     PM    CLS  Naive     PM CLS/PM Naive/PM
50   0.1 -0.008  0.482  0.002  0.602  0.806  0.606   0.97   0.90   0.96   0.99     1.33
     0.3 -0.008  1.128  0.039  0.602  1.363  0.614   0.97   0.68   0.96   0.98     2.22
     0.5 -0.008  1.714  0.074  0.602  1.952  0.625   0.97   0.54   0.92   0.96     3.12
100  0.1  0.006  0.435  0.019  0.422  0.621  0.422   0.97   0.84   0.96   1.00     1.47
     0.3  0.006  1.066  0.055  0.422  1.188  0.430   0.97   0.46   0.95   0.98     2.77
     0.5  0.006  1.665  0.107  0.422  1.782  0.443   0.97   0.25   0.91   0.95     4.03
500  0.1 -0.003  0.418  0.007  0.187  0.462  0.187   0.94   0.42   0.94   1.00     2.47
     0.3 -0.003  1.057  0.042  0.187  1.082  0.193   0.94   0.01   0.93   0.97     5.60
     0.5 -0.003  1.665  0.086  0.187  1.689  0.209   0.94   0.00   0.90   0.90     8.09

Table B.137: Empirical variance and asymptotic variance of slope (β̂1) when ρ = 0.6 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n    p    Empirical Variance   Asymptotic Variance
            CLS  Naive     PM    CLS  Naive     PM
50   0.1  0.004  0.005  0.004  0.004  0.004  0.004
     0.3  0.004  0.007  0.004  0.004  0.006  0.004
     0.5  0.004  0.010  0.005  0.004  0.007  0.004
100  0.1  0.002  0.002  0.002  0.002  0.002  0.002
     0.3  0.002  0.003  0.002  0.002  0.003  0.002
     0.5  0.002  0.004  0.002  0.002  0.003  0.002
500  0.1  0.000  0.000  0.000  0.000  0.000  0.000
     0.3  0.000  0.001  0.000  0.000  0.001  0.000
     0.5  0.000  0.001  0.000  0.000  0.001  0.000

Table B.138: Bias, RMSE, 95% CP, and relative RMSE of slope (β̂1) when ρ = 0.6 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

n    p           Bias                 RMSE                  CP               RRMSE
            CLS  Naive     PM    CLS  Naive     PM    CLS  Naive     PM CLS/PM Naive/PM
50   0.1  0.001 -0.033 -0.002  0.059  0.074  0.059   0.94   0.91   0.94   1.00     1.24
     0.3  0.001 -0.062 -0.008  0.059  0.097  0.061   0.94   0.83   0.92   0.98     1.61
     0.5  0.001 -0.081 -0.015  0.059  0.118  0.064   0.94   0.78   0.91   0.92     1.83
100  0.1 -0.000 -0.032 -0.003  0.041  0.055  0.041   0.93   0.87   0.92   1.00     1.33
     0.3 -0.000 -0.061 -0.009  0.041  0.079  0.042   0.93   0.78   0.92   0.98     1.88
     0.5 -0.000 -0.081 -0.016  0.041  0.100  0.045   0.93   0.69   0.90   0.91     2.20
500  0.1  0.000 -0.030 -0.002  0.017  0.036  0.017   0.96   0.67   0.95   0.99     2.09
     0.3  0.000 -0.060 -0.007  0.017  0.064  0.019   0.96   0.25   0.93   0.92     3.39
     0.5  0.000 -0.081 -0.014  0.017  0.085  0.024   0.96   0.10   0.86   0.71     3.48

Table B.139: Empirical variance and asymptotic variance of mean of x (µ̂x) when ρ = 0.6 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n    p    Empirical Variance   Asymptotic Variance
            CLS  Naive     PM    CLS  Naive     PM
50   0.1  1.954  2.042  1.985  1.984  1.740  1.959
     0.3  1.954  2.270  2.034  1.984  1.732  1.959
     0.5  1.954  2.492  2.110  1.984  1.821  1.924
100  0.1  0.937  0.953  0.947  0.996  0.876  0.981
     0.3  0.937  1.123  1.009  0.996  0.857  0.977
     0.5  0.937  1.184  1.038  0.996  0.897  0.960
500  0.1  0.208  0.208  0.211  0.200  0.174  0.198
     0.3  0.208  0.229  0.216  0.200  0.170  0.197
     0.5  0.208  0.251  0.225  0.200  0.177  0.193

Table B.140: Bias, RMSE, 95% CP, and relative RMSE of mean of x (µ̂x) when ρ = 0.6 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

n    p           Bias                 RMSE                  CP               RRMSE
            CLS  Naive     PM    CLS  Naive     PM    CLS  Naive     PM CLS/PM Naive/PM
50   0.1  0.097  1.699  0.076  1.401  2.151  1.402   0.94   0.73   0.94   1.01     1.53
     0.3  0.097  3.587  0.127  1.401  3.821  1.405   0.94   0.24   0.94   1.00     2.72
     0.5  0.097  5.231  0.213  1.401  5.402  1.404   0.94   0.05   0.93   1.01     3.85
100  0.1  0.066  1.467  0.079  0.970  1.740  0.993   0.95   0.66   0.95   1.01     1.75
     0.3  0.066  3.418  0.148  0.970  3.541  1.000   0.95   0.06   0.93   1.00     3.54
     0.5  0.066  5.080  0.251  0.970  5.167  1.011   0.95   0.00   0.91   0.99     5.11
500  0.1  0.026  1.453  0.044  0.457  1.512  0.447   0.94   0.08   0.93   1.00     3.38
     0.3  0.026  3.407  0.114  0.457  3.432  0.458   0.94   0.00   0.93   0.98     7.49
     0.5  0.026  5.083  0.230  0.457  5.100  0.495   0.94   0.00   0.90   0.90    10.30

Table B.141: Empirical variance and asymptotic variance of mean of y (µ̂y) when ρ = 0.6 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n    p    Empirical Variance   Asymptotic Variance
            CLS  Naive     PM    CLS  Naive     PM
50   0.1  0.454  0.469  0.452  0.503  0.438  0.495
     0.3  0.454  0.527  0.467  0.503  0.431  0.491
     0.5  0.454  0.583  0.483  0.503  0.454  0.483
100  0.1  0.212  0.223  0.215  0.250  0.218  0.245
     0.3  0.212  0.257  0.219  0.250  0.213  0.243
     0.5  0.212  0.287  0.235  0.250  0.221  0.238
500  0.1  0.052  0.054  0.052  0.050  0.043  0.049
     0.3  0.052  0.060  0.054  0.050  0.042  0.049
     0.5  0.052  0.065  0.057  0.050  0.044  0.048

Table B.142: Bias, RMSE, 95% CP, and relative RMSE of mean of y (µ̂y) when ρ = 0.6 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

n    p           Bias                 RMSE                  CP               RRMSE
            CLS  Naive     PM    CLS  Naive     PM    CLS  Naive     PM CLS/PM Naive/PM
50   0.1  0.019  0.832  0.015  0.710  1.063  0.703   0.96   0.77   0.96   1.01     1.51
     0.3  0.019  1.784  0.050  0.710  1.901  0.703   0.96   0.23   0.95   1.01     2.71
     0.5  0.019  2.608  0.094  0.710  2.694  0.701   0.96   0.04   0.94   1.01     3.84
100  0.1  0.023  0.733  0.034  0.501  0.869  0.496   0.97   0.64   0.97   1.01     1.75
     0.3  0.023  1.703  0.075  0.501  1.764  0.499   0.97   0.06   0.96   1.00     3.54
     0.5  0.023  2.534  0.135  0.501  2.578  0.506   0.97   0.00   0.95   0.99     5.09
500  0.1  0.005  0.719  0.015  0.223  0.749  0.223   0.93   0.09   0.93   1.00     3.36
     0.3  0.005  1.699  0.054  0.223  1.711  0.228   0.93   0.00   0.93   0.98     7.51
     0.5  0.005  2.535  0.111  0.223  2.543  0.245   0.93   0.00   0.90   0.91    10.36

Table B.143: Empirical variance and asymptotic variance of standard deviation of x (σ̂x) when ρ = 0.6 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n    p    Empirical Variance   Asymptotic Variance
            CLS  Naive     PM    CLS  Naive     PM
50   0.1  1.088  1.008  1.127  1.030  0.906  0.980
     0.3  1.088  1.076  1.234  1.030  0.908  0.980
     0.5  1.088  1.195  1.421  1.030  0.963  0.962
100  0.1  0.540  0.503  0.565  0.507  0.447  0.490
     0.3  0.540  0.493  0.621  0.507  0.438  0.489
     0.5  0.540  0.529  0.722  0.507  0.461  0.480
500  0.1  0.104  0.094  0.111  0.100  0.087  0.099
     0.3  0.104  0.093  0.124  0.100  0.085  0.099
     0.5  0.104  0.096  0.137  0.100  0.089  0.096

Table B.144: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of x (σ̂x) when ρ = 0.6 over
sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

n    p           Bias                 RMSE                  CP               RRMSE
            CLS  Naive     PM    CLS  Naive     PM    CLS  Naive     PM CLS/PM Naive/PM
50   0.1 -0.095 -1.219 -0.159  1.019  1.546  1.003   0.94   0.68   0.91   1.02     1.54
     0.3 -0.095 -2.086 -0.263  1.019  2.294  1.024   0.94   0.40   0.89   1.00     2.24
     0.5 -0.095 -2.690 -0.409  1.019  2.864  1.063   0.94   0.27   0.84   0.96     2.69
100  0.1 -0.045 -1.074 -0.126  0.714  1.265  0.712   0.93   0.61   0.91   1.00     1.78
     0.3 -0.045 -2.006 -0.244  0.714  2.112  0.740   0.93   0.17   0.88   0.96     2.85
     0.5 -0.045 -2.613 -0.385  0.714  2.699  0.792   0.93   0.06   0.85   0.90     3.41
500  0.1 -0.008 -1.079 -0.058  0.317  1.119  0.320   0.94   0.07   0.93   0.99     3.50
     0.3 -0.008 -2.007 -0.179  0.317  2.029  0.361   0.94   0.00   0.88   0.88     5.62
     0.5 -0.008 -2.616 -0.336  0.317  2.633  0.457   0.94   0.00   0.77   0.69     5.76

Table B.145: Empirical variance and asymptotic variance of standard deviation of y (σ̂y) when ρ = 0.6 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n    p    Empirical Variance   Asymptotic Variance
            CLS  Naive     PM    CLS  Naive     PM
50   0.1  0.271  0.262  0.302  0.261  0.228  0.247
     0.3  0.271  0.270  0.337  0.261  0.226  0.246
     0.5  0.271  0.301  0.390  0.261  0.240  0.242
100  0.1  0.127  0.123  0.137  0.127  0.111  0.123
     0.3  0.127  0.129  0.152  0.127  0.109  0.122
     0.5  0.127  0.142  0.178  0.127  0.114  0.119
500  0.1  0.024  0.021  0.025  0.025  0.022  0.025
     0.3  0.024  0.023  0.029  0.025  0.021  0.025
     0.5  0.024  0.025  0.034  0.025  0.022  0.024

Table B.146: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of y (σ̂y) when ρ = 0.6 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

n    p           Bias                 RMSE                  CP               RRMSE
            CLS  Naive     PM    CLS  Naive     PM    CLS  Naive     PM CLS/PM Naive/PM
50   0.1 -0.011 -0.593 -0.058  0.511  0.762  0.501   0.93   0.72   0.91   1.02     1.52
     0.3 -0.011 -1.050 -0.127  0.511  1.153  0.512   0.93   0.39   0.87   1.00     2.25
     0.5 -0.011 -1.351 -0.197  0.511  1.437  0.530   0.93   0.26   0.83   0.97     2.71
100  0.1 -0.011 -0.547 -0.062  0.357  0.641  0.356   0.94   0.59   0.92   1.00     1.80
     0.3 -0.011 -1.018 -0.132  0.357  1.070  0.373   0.94   0.18   0.88   0.96     2.87
     0.5 -0.011 -1.331 -0.214  0.357  1.373  0.406   0.94   0.07   0.83   0.88     3.38
500  0.1 -0.007 -0.546 -0.034  0.158  0.565  0.161   0.96   0.04   0.94   0.98     3.52
     0.3 -0.007 -1.014 -0.102  0.158  1.025  0.187   0.96   0.00   0.86   0.85     5.49
     0.5 -0.007 -1.320 -0.179  0.158  1.328  0.237   0.96   0.00   0.74   0.67     5.61

Table B.147: Empirical variance and asymptotic variance of correlation (ρ̂) when ρ = 0.6 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

n    p    Empirical Variance   Asymptotic Variance
            CLS  Naive     PM    CLS  Naive     PM
50   0.1  0.009  0.013  0.009  0.009  0.012  0.009
     0.3  0.009  0.020  0.009  0.009  0.017  0.009
     0.5  0.009  0.030  0.010  0.009  0.022  0.009
100  0.1  0.004  0.006  0.004  0.004  0.006  0.004
     0.3  0.004  0.009  0.005  0.004  0.008  0.004
     0.5  0.004  0.012  0.005  0.004  0.011  0.005
500  0.1  0.001  0.001  0.001  0.001  0.001  0.001
     0.3  0.001  0.002  0.001  0.001  0.002  0.001
     0.5  0.001  0.002  0.001  0.001  0.002  0.001

Table B.148: Bias, RMSE, 95% CP, and relative RMSE of correlation (ρ̂) when ρ = 0.6 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

n    p           Bias                 RMSE                  CP               RRMSE
            CLS  Naive     PM    CLS  Naive     PM    CLS  Naive     PM CLS/PM Naive/PM
50   0.1 -0.007 -0.074 -0.012  0.094  0.133  0.094   0.92   0.92   0.93   1.01     1.42
     0.3 -0.007 -0.129 -0.023  0.094  0.183  0.097   0.92   0.89   0.93   0.97     1.88
     0.5 -0.007 -0.169 -0.037  0.094  0.225  0.103   0.92   0.83   0.95   0.91     2.18
100  0.1 -0.004 -0.065 -0.008  0.066  0.100  0.065   0.94   0.89   0.94   1.01     1.53
     0.3 -0.004 -0.122 -0.019  0.066  0.151  0.069   0.94   0.79   0.93   0.95     2.20
     0.5 -0.004 -0.162 -0.033  0.066  0.192  0.075   0.94   0.68   0.93   0.87     2.55
500  0.1 -0.000 -0.060 -0.004  0.028  0.069  0.029   0.95   0.57   0.95   0.99     2.42
     0.3 -0.000 -0.118 -0.014  0.028  0.125  0.033   0.95   0.12   0.93   0.86     3.79
     0.5 -0.000 -0.162 -0.027  0.028  0.168  0.040   0.95   0.04   0.89   0.70     4.16

Table B.149: Empirical variance and asymptotic variance of intercept (β̂0) when ρ = 0.8 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50,
100, and 500

          Empirical Variance     Asymptotic Variance
  n    p    CLS  Naive     PM      CLS  Naive     PM
 50  0.1  0.170  0.211  0.183    0.204  0.246  0.208
 50  0.3  0.170  0.351  0.228    0.204  0.362  0.220
 50  0.5  0.170  0.584  0.283    0.204  0.549  0.235
100  0.1  0.083  0.104  0.092    0.100  0.116  0.101
100  0.3  0.083  0.149  0.111    0.100  0.168  0.106
100  0.5  0.083  0.228  0.138    0.100  0.253  0.112
500  0.1  0.019  0.022  0.020    0.020  0.023  0.020
500  0.3  0.019  0.033  0.025    0.020  0.033  0.021
500  0.5  0.019  0.045  0.029    0.020  0.049  0.022

Table B.150: Bias, RMSE, 95% CP, and relative RMSE of intercept (β̂_0) when ρ = 0.8 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

                   Bias                  RMSE                  CP                RRMSE
  n    p     CLS   Naive      PM    CLS  Naive     PM    CLS  Naive     PM  CLS/PM  Naive/PM
 50  0.1  -0.010   0.273   0.004  0.452  0.566  0.456   0.96   0.93   0.96    0.99      1.24
 50  0.3  -0.010   0.659   0.030  0.452  0.892  0.470   0.96   0.80   0.94    0.96      1.90
 50  0.5  -0.010   1.053   0.069  0.452  1.288  0.490   0.96   0.69   0.91    0.92      2.63
100  0.1   0.001   0.248   0.015  0.316  0.421  0.318   0.97   0.87   0.96    1.00      1.33
100  0.3   0.001   0.624   0.045  0.316  0.747  0.328   0.97   0.67   0.94    0.96      2.28
100  0.5   0.001   1.017   0.097  0.316  1.135  0.349   0.97   0.44   0.88    0.91      3.25
500  0.1  -0.003   0.236   0.007  0.140  0.280  0.141   0.95   0.64   0.94    1.00      1.99
500  0.3  -0.003   0.615   0.039  0.140  0.642  0.149   0.95   0.07   0.90    0.94      4.30
500  0.5  -0.003   1.005   0.081  0.140  1.029  0.169   0.95   0.00   0.88    0.83      6.09

Table B.151: Empirical variance and asymptotic variance of slope (β̂_1) when ρ = 0.8 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
  n    p    CLS  Naive     PM      CLS  Naive     PM
 50  0.1  0.002  0.003  0.002    0.002  0.003  0.002
 50  0.3  0.002  0.004  0.003    0.002  0.004  0.002
 50  0.5  0.002  0.006  0.003    0.002  0.005  0.002
100  0.1  0.001  0.001  0.001    0.001  0.001  0.001
100  0.3  0.001  0.002  0.001    0.001  0.002  0.001
100  0.5  0.001  0.003  0.002    0.001  0.002  0.001
500  0.1  0.000  0.000  0.000    0.000  0.000  0.000
500  0.3  0.000  0.000  0.000    0.000  0.000  0.000
500  0.5  0.000  0.000  0.000    0.000  0.000  0.000

Table B.152: Bias, RMSE, 95% CP, and relative RMSE of slope (β̂_1) when ρ = 0.8 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

                   Bias                  RMSE                  CP                RRMSE
  n    p     CLS   Naive      PM    CLS  Naive     PM    CLS  Naive     PM  CLS/PM  Naive/PM
 50  0.1   0.001  -0.024  -0.002  0.044  0.056  0.045   0.94   0.90   0.93    0.97      1.25
 50  0.3   0.001  -0.047  -0.009  0.044  0.075  0.047   0.94   0.85   0.92    0.94      1.62
 50  0.5   0.001  -0.065  -0.019  0.044  0.094  0.053   0.94   0.80   0.90    0.83      1.79
100  0.1   0.000  -0.023  -0.003  0.030  0.041  0.032   0.93   0.88   0.93    0.95      1.31
100  0.3   0.000  -0.045  -0.009  0.030  0.060  0.033   0.93   0.79   0.90    0.91      1.84
100  0.5   0.000  -0.064  -0.019  0.030  0.079  0.038   0.93   0.71   0.85    0.79      2.07
500  0.1  -0.000  -0.022  -0.002  0.014  0.026  0.014   0.96   0.71   0.94    0.99      1.80
500  0.3  -0.000  -0.044  -0.008  0.014  0.048  0.016   0.96   0.27   0.89    0.86      2.90
500  0.5  -0.000  -0.063  -0.017  0.014  0.066  0.022   0.96   0.11   0.77    0.63      2.94

Table B.153: Empirical variance and asymptotic variance of mean of x (µ̂_x) when ρ = 0.8 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
  n    p    CLS  Naive     PM      CLS  Naive     PM
 50  0.1  1.950  2.056  1.980    1.985  1.695  1.947
 50  0.3  1.950  2.172  2.027    1.985  1.631  1.932
 50  0.5  1.950  2.348  2.050    1.985  1.667  1.897
100  0.1  0.924  0.954  0.935    0.996  0.856  0.974
100  0.3  0.924  1.083  0.987    0.996  0.811  0.964
100  0.5  0.924  1.139  1.028    0.996  0.823  0.947
500  0.1  0.209  0.208  0.212    0.200  0.170  0.197
500  0.3  0.209  0.225  0.214    0.200  0.161  0.195
500  0.5  0.209  0.243  0.220    0.200  0.163  0.190

Table B.154: Bias, RMSE, 95% CP, and relative RMSE of mean of x (µ̂_x) when ρ = 0.8 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

                   Bias                  RMSE                  CP                RRMSE
  n    p     CLS   Naive      PM    CLS  Naive     PM    CLS  Naive     PM  CLS/PM  Naive/PM
 50  0.1   0.095   1.695   0.089  1.412  2.137  1.398   0.95   0.72   0.94    1.01      1.53
 50  0.3   0.095   3.582   0.169  1.412  3.803  1.400   0.95   0.23   0.94    1.01      2.72
 50  0.5   0.095   5.201   0.179  1.412  5.359  1.389   0.95   0.05   0.93    1.02      3.86
100  0.1   0.066   1.473   0.094  1.000  1.739  0.991   0.96   0.64   0.95    1.01      1.75
100  0.3   0.066   3.405   0.187  1.000  3.522  1.000   0.96   0.05   0.93    1.00      3.52
100  0.5   0.066   5.054   0.295  1.000  5.135  1.017   0.96   0.00   0.92    0.98      5.05
500  0.1   0.026   1.441   0.055  0.448  1.499  0.447   0.93   0.08   0.93    1.00      3.35
500  0.3   0.026   3.392   0.149  0.448  3.415  0.466   0.93   0.00   0.92    0.96      7.33
500  0.5   0.026   5.053   0.274  0.448  5.069  0.515   0.93   0.00   0.89    0.87      9.84

Table B.155: Empirical variance and asymptotic variance of mean of y (µ̂_y) when ρ = 0.8 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
  n    p    CLS  Naive     PM      CLS  Naive     PM
 50  0.1  0.463  0.482  0.464    0.501  0.426  0.490
 50  0.3  0.463  0.518  0.475    0.501  0.408  0.485
 50  0.5  0.463  0.567  0.478    0.501  0.415  0.474
100  0.1  0.213  0.220  0.217    0.250  0.213  0.243
100  0.3  0.213  0.245  0.221    0.250  0.202  0.240
100  0.5  0.213  0.269  0.234    0.250  0.204  0.235
500  0.1  0.052  0.053  0.053    0.050  0.043  0.049
500  0.3  0.052  0.058  0.055    0.050  0.040  0.049
500  0.5  0.052  0.062  0.056    0.050  0.041  0.047

Table B.156: Bias, RMSE, 95% CP, and relative RMSE of mean of y (µ̂_y) when ρ = 0.8 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

                   Bias                  RMSE                  CP                RRMSE
  n    p     CLS   Naive      PM    CLS  Naive     PM    CLS  Naive     PM  CLS/PM  Naive/PM
 50  0.1   0.028   0.836   0.031  0.709  1.061  0.700   0.95   0.73   0.95    1.01      1.51
 50  0.3   0.028   1.781   0.072  0.709  1.892  0.700   0.95   0.22   0.94    1.01      2.70
 50  0.5   0.028   2.598   0.083  0.709  2.676  0.694   0.95   0.04   0.93    1.02      3.86
100  0.1   0.027   0.737   0.045  0.500  0.869  0.495   0.96   0.64   0.96    1.01      1.75
100  0.3   0.027   1.698   0.094  0.500  1.756  0.499   0.96   0.04   0.95    1.00      3.52
100  0.5   0.027   2.525   0.156  0.500  2.565  0.509   0.96   0.00   0.93    0.98      5.04
500  0.1   0.007   0.717   0.023  0.224  0.746  0.223   0.94   0.09   0.92    1.00      3.35
500  0.3   0.007   1.690   0.073  0.224  1.702  0.232   0.94   0.00   0.92    0.96      7.34
500  0.5   0.007   2.524   0.135  0.224  2.532  0.256   0.94   0.00   0.89    0.87      9.89

Table B.157: Empirical variance and asymptotic variance of standard deviation of x (σ̂_x) when ρ = 0.8 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
  n    p    CLS  Naive     PM      CLS  Naive     PM
 50  0.1  1.082  0.991  1.124    1.040  0.889  0.974
 50  0.3  1.082  1.031  1.245    1.040  0.861  0.966
 50  0.5  1.082  1.166  1.462    1.040  0.888  0.949
100  0.1  0.533  0.491  0.550    0.510  0.438  0.487
100  0.3  0.533  0.473  0.623    0.510  0.416  0.482
100  0.5  0.533  0.512  0.707    0.510  0.424  0.474
500  0.1  0.105  0.094  0.110    0.100  0.086  0.098
500  0.3  0.105  0.089  0.127    0.100  0.081  0.097
500  0.5  0.105  0.091  0.140    0.100  0.082  0.095

Table B.158: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of x (σ̂_x) when ρ = 0.8 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

                   Bias                  RMSE                  CP                RRMSE
  n    p     CLS   Naive      PM    CLS  Naive     PM    CLS  Naive     PM  CLS/PM  Naive/PM
 50  0.1  -0.092  -1.280  -0.190  1.024  1.590  1.005   0.93   0.65   0.90    1.02      1.58
 50  0.3  -0.092  -2.194  -0.333  1.024  2.382  1.038   0.93   0.36   0.88    0.99      2.30
 50  0.5  -0.092  -2.804  -0.480  1.024  2.959  1.086   0.93   0.21   0.83    0.94      2.73
100  0.1  -0.046  -1.139  -0.160  0.715  1.317  0.716   0.94   0.58   0.91    1.00      1.84
100  0.3  -0.046  -2.107  -0.310  0.715  2.204  0.760   0.94   0.13   0.86    0.94      2.90
100  0.5  -0.046  -2.733  -0.447  0.715  2.810  0.821   0.94   0.03   0.84    0.87      3.42
500  0.1  -0.008  -1.130  -0.086  0.317  1.168  0.325   0.94   0.04   0.93    0.97      3.59
500  0.3  -0.008  -2.108  -0.239  0.317  2.127  0.393   0.94   0.00   0.84    0.81      5.41
500  0.5  -0.008  -2.739  -0.397  0.317  2.754  0.503   0.94   0.00   0.71    0.63      5.48

Table B.159: Empirical variance and asymptotic variance of standard deviation of y (σ̂_y) when ρ = 0.8 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
  n    p    CLS  Naive     PM      CLS  Naive     PM
 50  0.1  0.274  0.265  0.301    0.263  0.223  0.245
 50  0.3  0.274  0.270  0.341    0.263  0.215  0.243
 50  0.5  0.274  0.293  0.381    0.263  0.221  0.237
100  0.1  0.127  0.124  0.137    0.128  0.109  0.122
100  0.3  0.127  0.122  0.154    0.128  0.103  0.120
100  0.5  0.127  0.133  0.182    0.128  0.105  0.117
500  0.1  0.025  0.022  0.025    0.025  0.021  0.025
500  0.3  0.025  0.021  0.029    0.025  0.020  0.024
500  0.5  0.025  0.022  0.033    0.025  0.020  0.024

Table B.160: Bias, RMSE, 95% CP, and relative RMSE of standard
deviation of y (σ̂_y) when ρ = 0.8 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

                   Bias                  RMSE                  CP                RRMSE
  n    p     CLS   Naive      PM    CLS  Naive     PM    CLS  Naive     PM  CLS/PM  Naive/PM
 50  0.1  -0.021  -0.633  -0.083  0.513  0.790  0.502   0.93   0.68   0.89    1.02      1.57
 50  0.3  -0.021  -1.098  -0.158  0.513  1.192  0.517   0.93   0.35   0.87    0.99      2.30
 50  0.5  -0.021  -1.410  -0.241  0.513  1.487  0.544   0.93   0.23   0.82    0.94      2.74
100  0.1  -0.016  -0.578  -0.080  0.358  0.666  0.358   0.94   0.57   0.93    1.00      1.86
100  0.3  -0.016  -1.065  -0.161  0.358  1.113  0.382   0.94   0.13   0.86    0.94      2.91
100  0.5  -0.016  -1.387  -0.244  0.358  1.424  0.421   0.94   0.04   0.81    0.85      3.39
500  0.1  -0.006  -0.572  -0.046  0.159  0.590  0.164   0.96   0.02   0.92    0.97      3.61
500  0.3  -0.006  -1.060  -0.128  0.159  1.069  0.202   0.96   0.00   0.82    0.79      5.31
500  0.5  -0.006  -1.379  -0.206  0.159  1.386  0.257   0.96   0.00   0.71    0.62      5.39

Table B.161: Empirical variance and asymptotic variance of correlation (ρ̂) when ρ = 0.8 over different left censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
  n    p    CLS  Naive     PM      CLS  Naive     PM
 50  0.1  0.003  0.005  0.003    0.003  0.005  0.003
 50  0.3  0.003  0.009  0.003    0.003  0.007  0.003
 50  0.5  0.003  0.015  0.003    0.003  0.011  0.004
100  0.1  0.001  0.002  0.001    0.001  0.002  0.001
100  0.3  0.001  0.004  0.002    0.001  0.003  0.002
100  0.5  0.001  0.006  0.002    0.001  0.005  0.002
500  0.1  0.000  0.001  0.000    0.000  0.000  0.000
500  0.3  0.000  0.001  0.000    0.000  0.001  0.000
500  0.5  0.000  0.001  0.000    0.000  0.001  0.000

Table B.162: Bias, RMSE, 95% CP, and relative RMSE of correlation (ρ̂) when ρ = 0.8 over sample sizes 50, 100, and 500 and left censoring proportions of 0.1, 0.3, and 0.5

                   Bias                  RMSE                  CP                RRMSE
  n    p     CLS   Naive      PM    CLS  Naive     PM    CLS  Naive     PM  CLS/PM  Naive/PM
 50  0.1  -0.005  -0.053  -0.010  0.055  0.086  0.055   0.94   0.94   0.94    1.01      1.58
 50  0.3  -0.005  -0.098  -0.023  0.055  0.130  0.061   0.94   0.88   0.97    0.90      2.13
 50  0.5  -0.005  -0.135  -0.043  0.055  0.170  0.075   0.94   0.83   0.96    0.74      2.27
100  0.1  -0.003  -0.046  -0.007  0.038  0.065  0.038   0.94   0.89   0.95    0.98      1.70
100  0.3  -0.003  -0.091  -0.019  0.038  0.107  0.043   0.94   0.72   0.95    0.87      2.49
100  0.5  -0.003  -0.127  -0.037  0.038  0.145  0.057   0.94   0.60   0.91    0.66      2.56
500  0.1  -0.000  -0.042  -0.004  0.017  0.047  0.018   0.94   0.45   0.95    0.97      2.63
500  0.3  -0.000  -0.088  -0.016  0.017  0.091  0.023   0.94   0.04   0.89    0.74      3.89
500  0.5  -0.000  -0.124  -0.034  0.017  0.127  0.038   0.94   0.00   0.58    0.46      3.35

B.4 Results in the Presence of Interval Censored Observations in the Data

Figure B.10: Boxplots for estimates of mean of x (µ̂_x) and mean of y (µ̂_y), when n = 100, ρ = 0.2 over different interval censoring proportions of 0.1, 0.3, and 0.5

Figure B.11: Boxplots for estimates of standard deviation of x (σ̂_x) and standard deviation of y (σ̂_y), when n = 100, ρ = 0.2 over different interval censoring proportions of 0.1, 0.3, and 0.5

Figure B.12: Boxplots for estimates of correlation between x and y (ρ̂), when n = 100, ρ = 0.2 over different interval censoring proportions of 0.1, 0.3, and 0.5

Table B.163: Empirical variance and asymptotic variance of intercept (β̂_0) when ρ = 0.2 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
  n    p    CLS  Naive     PM      CLS  Naive     PM
 50  0.1  0.460  0.490  0.460    0.545  0.627  0.545
 50  0.3  0.460  0.630  0.458    0.545  0.832  0.545
 50  0.5  0.460  0.949  0.463    0.545  1.086  0.544
100  0.1  0.215  0.239  0.215    0.267  0.307  0.267
100  0.3  0.215  0.342  0.216    0.267  0.407  0.267
100  0.5  0.215  0.583  0.217    0.267  0.533  0.267
500  0.1  0.052  0.061  0.052    0.052  0.060  0.052
500  0.3  0.052  0.121  0.052    0.052  0.120  0.052
500  0.5  0.052  0.289  0.053    0.052  0.210  0.052

Table B.164: Empirical variance and asymptotic variance of slope (β̂_1) when ρ = 0.2 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
  n    p    CLS  Naive     PM      CLS  Naive     PM
 50  0.1  0.006  0.006  0.006    0.005  0.006  0.005
 50  0.3  0.006  0.007  0.006    0.005  0.007  0.005
 50  0.5  0.006  0.010  0.006    0.005  0.009  0.005
100  0.1  0.003  0.003  0.003    0.003  0.003  0.003
100  0.3  0.003  0.004  0.003    0.003  0.003  0.003
100  0.5  0.003  0.005  0.003    0.003  0.004  0.003
500  0.1  0.001  0.001  0.001    0.001  0.001  0.001
500  0.3  0.001  0.001  0.001    0.001  0.001  0.001
500  0.5  0.001  0.001  0.001    0.001  0.001  0.001

Table B.165: Empirical variance and asymptotic variance of mean of x (µ̂_x) when ρ = 0.2 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
  n    p    CLS  Naive     PM      CLS  Naive     PM
 50  0.1  1.955  2.172  1.956    1.984  2.307  1.944
 50  0.3  1.955  2.918  1.969    1.984  3.139  1.943
 50  0.5  1.955  4.371  2.011    1.984  4.216  1.939
100  0.1  0.958  1.067  0.958    0.997  1.158  0.987
100  0.3  0.958  1.545  0.959    0.997  1.585  0.987
100  0.5  0.958  2.591  0.969    0.997  2.163  0.986
500  0.1  0.206  0.232  0.206    0.200  0.232  0.200
500  0.3  0.206  0.523  0.208    0.200  0.476  0.200
500  0.5  0.206  1.459  0.206    0.200  0.859  0.199

Table B.166: Bias, RMSE, 95% CP, and relative RMSE of mean of x (µ̂_x) when ρ = 0.2 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5

                   Bias                  RMSE                  CP                RRMSE
  n    p     CLS   Naive      PM    CLS  Naive     PM    CLS  Naive     PM  CLS/PM  Naive/PM
 50  0.1   0.099   0.066   0.100  1.412  1.520  1.398   0.95   0.94   0.94    1.01      1.09
 50  0.3   0.099  -0.307   0.100  1.412  1.798  1.397   0.95   0.94   0.94    1.01      1.29
 50  0.5   0.099  -1.223   0.092  1.412  2.390  1.395   0.95   0.89   0.93    1.01      1.71
100  0.1   0.064   0.050   0.064  1.001  1.077  0.996   0.96   0.96   0.96    1.01      1.08
100  0.3   0.064  -0.329   0.064  1.001  1.301  0.995   0.96   0.94   0.95    1.01      1.31
100  0.5   0.064  -1.279   0.057  1.001  1.949  0.994   0.96   0.82   0.95    1.01      1.96
500  0.1   0.027  -0.018   0.027  0.448  0.482  0.448   0.94   0.94   0.94    1.00      1.08
500  0.3   0.027  -0.438   0.027  0.448  0.818  0.448   0.94   0.88   0.94    1.00      1.83
500  0.5   0.027  -1.459   0.023  0.448  1.729  0.447   0.94   0.60   0.94    1.00      3.87

Table B.167: Empirical variance and asymptotic variance of mean of y (µ̂_y) when ρ = 0.2 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
  n    p    CLS  Naive     PM      CLS  Naive     PM
 50  0.1  0.444  0.470  0.444    0.505  0.587  0.495
 50  0.3  0.444  0.616  0.443    0.505  0.802  0.495
 50  0.5  0.444  0.975  0.450    0.505  1.081  0.494
100  0.1  0.216  0.239  0.217    0.251  0.291  0.248
100  0.3  0.216  0.345  0.216    0.251  0.398  0.248
100  0.5  0.216  0.614  0.218    0.251  0.541  0.248
500  0.1  0.051  0.060  0.051    0.050  0.058  0.050
500  0.3  0.051  0.135  0.051    0.050  0.119  0.050
500  0.5  0.051  0.356  0.051    0.050  0.217  0.050

Table B.168: Bias, RMSE, 95% CP, and relative RMSE of mean of y (µ̂_y) when ρ = 0.2 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5

                   Bias                  RMSE                  CP                RRMSE
  n    p     CLS   Naive      PM    CLS  Naive     PM    CLS  Naive     PM  CLS/PM  Naive/PM
 50  0.1   0.005  -0.021   0.005  0.711  0.766  0.704   0.95   0.96   0.95    1.01      1.09
 50  0.3   0.005  -0.211   0.005  0.711  0.920  0.704   0.95   0.96   0.95    1.01      1.31
 50  0.5   0.005  -0.689   0.004  0.711  1.247  0.703   0.95   0.90   0.95    1.01      1.77
100  0.1   0.015  -0.002   0.015  0.501  0.539  0.499   0.97   0.97   0.97    1.01      1.08
100  0.3   0.015  -0.190   0.017  0.501  0.659  0.499   0.97   0.94   0.97    1.00      1.32
100  0.5   0.015  -0.675   0.015  0.501  0.998  0.498   0.97   0.79   0.97    1.01      2.00
500  0.1   0.001  -0.022   0.001  0.223  0.242  0.223   0.93   0.94   0.94    1.00      1.08
500  0.3   0.001  -0.216   0.001  0.223  0.407  0.223   0.93   0.87   0.93    1.00      1.82
500  0.5   0.001  -0.711  -0.000  0.223  0.850  0.223   0.93   0.62   0.93    1.00      3.81

Table B.169: Empirical variance and asymptotic variance of standard deviation of x (σ̂_x) when ρ = 0.2 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

          Empirical Variance     Asymptotic Variance
  n    p    CLS  Naive     PM      CLS  Naive     PM
 50  0.1  1.104  1.179  1.082    1.023  1.194  0.972
 50  0.3  1.104  1.453  1.087    1.023  1.639  0.971
 50  0.5  1.104  2.163  1.089    1.023  2.230  0.969
100  0.1  0.551  0.590  0.546    0.506  0.589  0.494
100  0.3  0.551  0.758  0.545    0.506  0.810  0.493
100  0.5  0.551  1.130  0.551    0.506  1.112  0.493
500  0.1  0.102  0.116  0.102    0.100  0.117  0.100
500  0.3  0.102  0.159  0.102    0.100  0.239  0.100
500  0.5  0.102  0.499  0.103    0.100  0.432  0.100

Table B.170: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of x (σ̂_x) when ρ = 0.2 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1,
0.3, and 0.5 n 50 100 500 p Bias RMSE CP RRMSE CLS Naive PM CLS Naive PM CLS Naive PM CLS/PM Naive/PM 0.1 -0.096 0.150 -0.196 1.016 1.103 1.005 0.94 0.95 0.93 1.01 1.10 0.3 -0.096 0.610 -0.199 1.016 1.418 1.006 0.94 0.97 0.93 1.01 1.41 0.5 -0.096 0.891 -0.210 1.016 1.739 1.007 0.94 0.95 0.93 1.01 1.73 0.1 -0.043 0.198 -0.093 0.713 0.793 0.709 0.94 0.96 0.93 1.01 1.12 0.3 -0.043 0.689 -0.095 0.713 1.134 0.709 0.94 0.92 0.93 1.01 1.60 0.5 -0.043 1.033 -0.100 0.713 1.476 0.709 0.94 0.89 0.93 1.01 2.08 0.1 -0.008 0.233 -0.018 0.317 0.413 0.316 0.95 0.89 0.95 1.00 1.31 0.3 -0.008 0.712 -0.018 0.317 0.863 0.316 0.95 0.75 0.95 1.00 2.73 0.5 -0.008 0.996 -0.022 0.317 1.193 0.317 0.95 0.64 0.95 1.00 3.77 Table B.171: Empirical variance and asymptotic variance of standard deviation of y (σˆy ) when ρ = 0.2 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500 n 50 100 500 p Empirical Variance Asymptotic Variance CLS Naive PM CLS Naive PM 0.1 0.257 0.288 0.252 0.261 0.304 0.248 0.3 0.257 0.340 0.252 0.261 0.419 0.247 0.5 0.257 0.489 0.253 0.261 0.572 0.247 0.1 0.126 0.143 0.125 0.127 0.148 0.124 0.3 0.126 0.181 0.125 0.127 0.203 0.124 0.5 0.126 0.295 0.124 0.127 0.278 0.124 0.1 0.025 0.027 0.025 0.025 0.029 0.025 0.3 0.025 0.038 0.025 0.025 0.060 0.025 0.5 0.025 0.113 0.025 0.025 0.109 0.025 187 Table B.172: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of y (σˆy ) when ρ = 0.2 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5 n 50 100 500 p Bias RMSE CP RRMSE CLS Naive PM CLS Naive PM CLS Naive PM CLS/PM Naive/PM 0.1 0.000 0.120 -0.050 0.511 0.564 0.500 0.94 0.95 0.93 1.02 1.13 0.3 0.000 0.367 -0.051 0.511 0.744 0.500 0.94 0.96 0.93 1.02 1.49 0.5 0.000 0.523 -0.054 0.511 0.920 0.500 0.94 0.95 0.93 1.02 1.84 0.1 -0.005 0.111 -0.030 0.357 0.400 0.354 0.94 0.96 0.93 1.01 1.13 0.3 -0.005 0.356 -0.030 0.357 0.574 0.354 0.94 0.93 0.93 1.01 1.62 0.5 -0.005 0.518 -0.031 0.357 0.740 
0.354 0.94 0.88 0.93 1.01 2.09 0.1 -0.008 0.118 -0.013 0.158 0.207 0.158 0.95 0.90 0.95 1.00 1.31 0.3 -0.008 0.363 -0.013 0.158 0.438 0.158 0.95 0.75 0.95 1.00 2.77 0.5 -0.008 0.522 -0.014 0.158 0.618 0.158 0.95 0.62 0.95 1.00 3.90 Table B.173: Empirical variance and asymptotic variance of correlation (ρ̂) when ρ = 0.2 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500 n 50 100 500 p Empirical Variance Asymptotic Variance CLS Naive PM CLS Naive PM 0.1 0.020 0.022 0.020 0.019 0.021 0.018 0.3 0.020 0.026 0.020 0.019 0.025 0.018 0.5 0.020 0.034 0.020 0.019 0.032 0.018 0.1 0.010 0.011 0.010 0.009 0.010 0.009 0.3 0.010 0.013 0.010 0.009 0.013 0.009 0.5 0.010 0.017 0.010 0.009 0.016 0.009 0.1 0.002 0.002 0.002 0.002 0.002 0.002 0.3 0.002 0.003 0.002 0.002 0.003 0.002 0.5 0.002 0.004 0.002 0.002 0.003 0.002 188 Table B.174: Bias, RMSE, 95% CP, and relative RMSE of correlation (ρ̂) when ρ = 0.2 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5 n 50 100 500 p Bias RMSE CP RRMSE CLS Naive PM CLS Naive PM CLS Naive PM CLS/PM Naive/PM 0.1 -0.005 0.004 -0.005 0.137 0.143 0.134 0.93 0.93 0.92 1.02 1.07 0.3 -0.005 0.022 -0.005 0.137 0.160 0.134 0.93 0.91 0.91 1.02 1.20 0.5 -0.005 0.032 -0.006 0.137 0.181 0.134 0.93 0.90 0.92 1.02 1.35 0.1 -0.004 0.005 -0.004 0.097 0.101 0.096 0.93 0.93 0.93 1.01 1.06 0.3 -0.004 0.023 -0.005 0.097 0.114 0.096 0.93 0.93 0.93 1.01 1.19 0.5 -0.004 0.039 -0.005 0.097 0.131 0.096 0.93 0.91 0.93 1.01 1.37 0.1 0.000 0.009 0.000 0.042 0.046 0.042 0.95 0.94 0.95 1.00 1.08 0.3 0.000 0.028 0.000 0.042 0.057 0.042 0.95 0.91 0.95 1.00 1.35 0.5 0.000 0.041 -0.000 0.042 0.069 0.042 0.95 0.85 0.95 1.00 1.63 Table B.175: Empirical variance and asymptotic variance of intercept (βˆ0 ) when ρ = 0.4 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500 n 50 100 500 p Empirical Variance Asymptotic Variance CLS Naive PM CLS Naive PM 
0.1 0.400 0.421 0.417 0.476 0.544 0.521 0.3 0.400 0.552 0.401 0.476 0.706 0.537 0.5 0.400 0.769 0.406 0.476 0.902 0.538 0.1 0.189 0.210 0.203 0.234 0.267 0.250 0.3 0.189 0.291 0.193 0.234 0.348 0.260 0.5 0.189 0.456 0.193 0.234 0.442 0.262 0.1 0.045 0.052 0.047 0.046 0.053 0.048 0.3 0.045 0.093 0.047 0.046 0.102 0.051 0.5 0.045 0.192 0.047 0.046 0.174 0.052 189 Table B.176: Bias, RMSE, 95% CP, and relative RMSE of intercept (βˆ0 ) when ρ = 0.4 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5 n p Bias CLS 50 100 500 RMSE CP RRMSE Naive PM CLS Naive PM CLS Naive PM CLS/PM Naive/PM 0.1 -0.003 -0.050 0.012 0.690 0.739 0.722 1.00 0.97 0.97 0.96 1.02 0.3 -0.003 -0.233 0.037 0.690 0.872 0.734 1.00 0.96 0.98 0.94 1.19 0.5 -0.003 -0.550 0.044 0.690 1.097 0.735 1.00 0.90 0.97 0.94 1.49 0.1 0.010 -0.019 0.029 0.483 0.518 0.501 1.00 0.98 0.96 0.97 1.03 0.3 0.010 -0.201 0.051 0.483 0.624 0.513 1.00 0.95 0.98 0.94 1.22 0.5 0.010 -0.552 0.060 0.483 0.864 0.516 1.00 0.84 0.97 0.94 1.68 0.1 -0.003 -0.041 0.019 0.214 0.233 0.221 1.00 0.94 0.93 0.97 1.06 0.3 -0.003 -0.213 0.044 0.214 0.384 0.230 1.00 0.90 0.95 0.93 1.67 0.5 -0.003 -0.550 0.050 0.214 0.690 0.233 1.00 0.69 0.96 0.92 2.97 Table B.177: Empirical variance and asymptotic variance of slope (βˆ1 ) when ρ = 0.4 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500 n 50 100 500 p Empirical Variance Asymptotic Variance CLS Naive PM CLS Naive PM 0.1 0.005 0.005 0.006 0.005 0.005 0.038 0.3 0.005 0.006 0.005 0.005 0.006 0.005 0.5 0.005 0.008 0.005 0.005 0.008 0.005 0.1 0.002 0.002 0.003 0.002 0.002 0.002 0.3 0.002 0.003 0.002 0.002 0.003 0.002 0.5 0.002 0.004 0.002 0.002 0.004 0.003 0.1 0.000 0.000 0.001 0.000 0.001 0.000 0.3 0.000 0.001 0.001 0.000 0.001 0.001 0.5 0.000 0.001 0.000 0.000 0.001 0.001 190 Table B.178: Bias, RMSE, 95% CP, and relative RMSE of slope (βˆ1 ) when ρ = 0.4 over sample sizes 50, 100, and 500 and interval censoring 
proportions of 0.1, 0.3, and 0.5 n 50 100 500 p Bias RMSE CP RRMSE CLS Naive PM CLS Naive PM CLS Naive PM CLS/PM Naive/PM 0.1 0.001 0.009 -0.006 0.068 0.071 0.195 0.93 0.93 0.91 0.35 0.37 0.3 0.001 0.023 -0.013 0.068 0.081 0.073 0.93 0.93 0.95 0.93 1.11 0.5 0.001 0.035 -0.016 0.068 0.094 0.074 0.93 0.91 0.95 0.92 1.27 0.1 -0.001 0.007 -0.007 0.047 0.049 0.049 0.94 0.94 0.91 0.97 1.02 0.3 -0.001 0.021 -0.014 0.047 0.058 0.051 0.94 0.93 0.93 0.92 1.14 0.5 -0.001 0.034 -0.017 0.047 0.069 0.053 0.94 0.89 0.94 0.89 1.31 0.1 0.000 0.008 -0.007 0.020 0.024 0.021 0.96 0.95 0.88 0.94 1.11 0.3 0.000 0.025 -0.015 0.020 0.035 0.027 0.96 0.78 0.89 0.74 1.29 0.5 0.000 0.036 -0.017 0.020 0.045 0.028 0.96 0.67 0.89 0.71 1.59 Table B.179: Empirical variance and asymptotic variance of mean of x (µˆx ) when ρ = 0.4 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500 n 50 100 500 p Empirical Variance Asymptotic Variance CLS Naive PM CLS Naive PM 0.1 1.955 2.104 1.963 1.984 2.321 2.043 0.3 1.955 2.840 1.969 1.984 3.190 2.104 0.5 1.955 4.585 2.005 1.984 4.277 2.114 0.1 0.948 1.031 0.947 0.997 1.167 1.044 0.3 0.948 1.516 0.957 0.997 1.614 1.065 0.5 0.948 2.598 0.967 0.997 2.184 1.073 0.1 0.207 0.245 0.207 0.200 0.234 0.208 0.3 0.207 0.565 0.208 0.200 0.483 0.216 0.5 0.207 1.538 0.206 0.200 0.869 0.217 191 Table B.180: Bias, RMSE, 95% CP, and relative RMSE of mean of x (µˆx ) when ρ = 0.4 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5 n 50 100 500 p Bias RMSE CP RRMSE CLS Naive PM CLS Naive PM CLS Naive PM CLS/PM Naive/PM 0.1 0.099 0.092 0.106 1.412 1.526 1.433 0.94 0.95 0.94 0.99 1.06 0.3 0.099 -0.324 0.099 1.412 1.815 1.454 0.94 0.95 0.95 0.97 1.25 0.5 0.099 -1.405 0.096 1.412 2.500 1.457 0.94 0.87 0.95 0.97 1.72 0.1 0.065 0.019 0.065 1.000 1.081 1.024 0.96 0.96 0.96 0.98 1.06 0.3 0.065 -0.419 0.065 1.000 1.338 1.034 0.96 0.93 0.96 0.97 1.29 0.5 0.065 -1.525 0.061 1.000 2.123 1.037 0.96 0.78 
0.96 0.96 2.05 0.1 0.027 -0.033 0.027 0.448 0.485 0.457 0.94 0.94 0.94 0.98 1.06 0.3 0.027 -0.505 0.028 0.448 0.860 0.466 0.94 0.86 0.95 0.96 1.85 0.5 0.027 -1.707 0.030 0.448 1.945 0.467 0.94 0.54 0.96 0.96 4.16 Table B.181: Empirical variance and asymptotic variance of mean of y (µˆy ) when ρ = 0.4 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500 n 50 100 500 p Empirical Variance Asymptotic Variance CLS Naive PM CLS Naive PM 0.1 0.448 0.464 0.444 0.504 0.588 0.531 0.3 0.448 0.623 0.447 0.504 0.807 0.540 0.5 0.448 0.971 0.456 0.504 1.086 0.539 0.1 0.214 0.232 0.214 0.251 0.293 0.261 0.3 0.214 0.344 0.214 0.251 0.404 0.269 0.5 0.214 0.631 0.215 0.251 0.546 0.270 0.1 0.051 0.060 0.051 0.050 0.058 0.052 0.3 0.051 0.143 0.051 0.050 0.121 0.054 0.5 0.051 0.386 0.052 0.050 0.218 0.054 192 Table B.182: Bias, RMSE, 95% CP, and relative RMSE of mean of y (µˆy ) when ρ = 0.4 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5 n 50 100 500 p Bias RMSE CP RRMSE CLS Naive PM CLS Naive PM CLS Naive PM CLS/PM Naive/PM 0.1 0.012 -0.013 0.012 0.710 0.767 0.729 0.95 0.97 0.96 0.97 1.05 0.3 0.012 -0.246 0.010 0.710 0.931 0.735 0.95 0.97 0.96 0.97 1.27 0.5 0.012 -0.784 0.009 0.710 1.304 0.734 0.95 0.87 0.96 0.97 1.78 0.1 0.019 0.003 0.019 0.501 0.541 0.511 0.97 0.97 0.98 0.98 1.06 0.3 0.019 -0.232 0.020 0.501 0.676 0.519 0.97 0.94 0.97 0.97 1.30 0.5 0.019 -0.801 0.019 0.501 1.090 0.520 0.97 0.75 0.97 0.96 2.10 0.1 0.003 -0.025 0.003 0.223 0.243 0.227 0.93 0.94 0.93 0.98 1.07 0.3 0.003 -0.251 0.004 0.223 0.429 0.232 0.93 0.83 0.94 0.96 1.85 0.5 0.003 -0.836 0.004 0.223 0.958 0.233 0.93 0.54 0.94 0.96 4.11 Table B.183: Empirical variance and asymptotic variance of standard deviation of x (σˆx ) when ρ = 0.4 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500 n 50 100 500 p Empirical Variance Asymptotic Variance CLS Naive PM CLS Naive PM 0.1 1.096 
1.196 1.670 1.025 1.204 1.021 0.3 1.096 1.477 1.301 1.025 1.672 1.052 0.5 1.096 2.124 1.192 1.025 2.273 1.057 0.1 0.546 0.599 1.303 0.507 0.594 0.522 0.3 0.546 0.748 0.731 0.507 0.826 0.532 0.5 0.546 1.161 0.640 0.507 1.125 0.536 0.1 0.103 0.117 0.304 0.100 0.117 0.104 0.3 0.103 0.170 0.205 0.100 0.243 0.108 0.5 0.103 0.527 0.138 0.100 0.437 0.109 193 Table B.184: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of x (σˆx ) when ρ = 0.4 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5 n 50 100 500 p Bias RMSE CP RRMSE CLS Naive PM CLS Naive PM CLS Naive PM CLS/PM Naive/PM 0.1 -0.096 0.182 0.024 1.017 1.112 1.011 0.94 0.95 0.87 1.01 1.10 0.3 -0.096 0.717 0.193 1.017 1.478 1.044 0.94 0.95 0.93 0.97 1.42 0.5 -0.096 1.026 0.223 1.017 1.824 1.052 0.94 0.96 0.95 0.97 1.73 0.1 -0.044 0.241 0.151 0.713 0.808 0.738 0.93 0.95 0.84 0.97 1.09 0.3 -0.044 0.800 0.283 0.713 1.211 0.782 0.93 0.90 0.90 0.91 1.55 0.5 -0.044 1.129 0.325 0.713 1.549 0.801 0.93 0.85 0.93 0.89 1.93 0.1 -0.008 0.264 0.194 0.317 0.433 0.377 0.95 0.88 0.73 0.84 1.15 0.3 -0.008 0.800 0.391 0.317 0.940 0.511 0.95 0.67 0.70 0.62 1.84 0.5 -0.008 1.104 0.419 0.317 1.287 0.533 0.95 0.54 0.73 0.59 2.41 Table B.185: Empirical variance and asymptotic variance of standard deviation of y (σˆy ) when ρ = 0.4 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500 n 50 100 500 p Empirical Variance Asymptotic Variance CLS Naive PM CLS Naive PM 0.1 0.264 0.297 0.457 0.261 0.305 0.266 0.3 0.264 0.355 0.319 0.261 0.423 0.270 0.5 0.264 0.506 0.280 0.261 0.577 0.270 0.1 0.127 0.141 0.238 0.127 0.149 0.131 0.3 0.127 0.179 0.163 0.127 0.207 0.135 0.5 0.127 0.309 0.138 0.127 0.281 0.135 0.1 0.024 0.027 0.066 0.025 0.029 0.026 0.3 0.024 0.039 0.046 0.025 0.061 0.027 0.5 0.024 0.118 0.032 0.025 0.110 0.027 194 Table B.186: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of y (σˆy ) when ρ = 0.4 over sample sizes 50, 100, 
and 500 and interval censoring proportions of 0.1, 0.3, and 0.5 n 50 100 500 p Bias RMSE CP RRMSE CLS Naive PM CLS Naive PM CLS Naive PM CLS/PM Naive/PM 0.1 -0.004 0.126 0.108 0.511 0.567 0.527 0.93 0.95 0.89 0.97 1.08 0.3 -0.004 0.392 0.165 0.511 0.759 0.545 0.93 0.95 0.94 0.94 1.39 0.5 -0.004 0.560 0.165 0.511 0.944 0.545 0.93 0.95 0.95 0.94 1.73 0.1 -0.008 0.132 0.087 0.357 0.408 0.372 0.94 0.96 0.85 0.96 1.10 0.3 -0.008 0.401 0.171 0.357 0.606 0.405 0.94 0.91 0.91 0.88 1.50 0.5 -0.008 0.565 0.182 0.357 0.775 0.410 0.94 0.84 0.94 0.87 1.89 0.1 -0.007 0.131 0.078 0.158 0.216 0.179 0.96 0.90 0.76 0.88 1.20 0.3 -0.007 0.401 0.184 0.158 0.470 0.246 0.96 0.69 0.73 0.64 1.91 0.5 -0.007 0.566 0.205 0.158 0.656 0.263 0.96 0.56 0.74 0.60 2.49 Table B.187: Empirical variance and asymptotic variance of correlation (ρ̂) when ρ = 0.4 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500 n 50 100 500 p Empirical Variance Asymptotic Variance CLS Naive PM CLS Naive PM 0.1 0.015 0.017 0.018 0.015 0.016 0.015 0.3 0.015 0.019 0.015 0.015 0.019 0.015 0.5 0.015 0.023 0.015 0.015 0.023 0.015 0.1 0.007 0.008 0.009 0.007 0.008 0.007 0.3 0.007 0.009 0.008 0.007 0.009 0.007 0.5 0.007 0.012 0.007 0.007 0.011 0.007 0.1 0.002 0.002 0.002 0.001 0.002 0.001 0.3 0.002 0.002 0.002 0.001 0.002 0.002 0.5 0.002 0.003 0.002 0.001 0.002 0.002 195 Table B.188: Bias, RMSE, 95% CP, and relative RMSE of correlation (ρ̂) when ρ = 0.4 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5 n 50 100 500 p Bias RMSE CP RRMSE CLS Naive PM CLS Naive PM CLS Naive PM CLS/PM Naive/PM 0.1 -0.007 0.009 -0.031 0.121 0.126 0.124 0.93 0.91 0.89 0.97 1.01 0.3 -0.007 0.038 -0.037 0.121 0.142 0.127 0.93 0.89 0.93 0.96 1.12 0.5 -0.007 0.058 -0.040 0.121 0.162 0.128 0.93 0.87 0.93 0.94 1.26 0.1 -0.005 0.010 -0.019 0.085 0.089 0.087 0.93 0.94 0.91 0.98 1.02 0.3 -0.005 0.039 -0.032 0.085 0.103 0.092 0.93 0.91 0.93 0.92 1.12 0.5 -0.005 0.064 
-0.037 0.085 0.123 0.094 0.93 0.84 0.94 0.91 1.31
0.1 0.000 0.016 -0.014 0.037 0.042 0.040 0.95 0.93 0.90 0.94 1.05
0.3 0.000 0.049 -0.030 0.037 0.065 0.049 0.95 0.73 0.87 0.77 1.33
0.5 0.000 0.070 -0.034 0.037 0.084 0.052 0.95 0.60 0.87 0.72 1.62

Table B.189: Empirical variance and asymptotic variance of intercept (β̂0) when ρ = 0.6 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

              Empirical Variance     Asymptotic Variance
n    p       CLS   Naive      PM     CLS   Naive      PM
50   0.1   0.304   0.329   0.303   0.363   0.410   0.363
50   0.3   0.304   0.419   0.302   0.363   0.517   0.363
50   0.5   0.304   0.585   0.303   0.363   0.633   0.364
100  0.1   0.145   0.165   0.146   0.178   0.201   0.178
100  0.3   0.145   0.216   0.145   0.178   0.254   0.178
100  0.5   0.145   0.291   0.149   0.178   0.312   0.178
500  0.1   0.034   0.039   0.034   0.035   0.039   0.035
500  0.3   0.034   0.058   0.034   0.035   0.050   0.035
500  0.5   0.034   0.096   0.036   0.035   0.061   0.035

Table B.190: Bias, RMSE, 95% CP, and relative RMSE of intercept (β̂0) when ρ = 0.6 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5

                    Bias                    RMSE                      CP                  RRMSE
n    p       CLS   Naive      PM     CLS   Naive      PM     CLS   Naive      PM    CLS/PM  Naive/PM
50   0.1  -0.008  -0.040  -0.008   0.602   0.641   0.602    1.00    0.96    0.97      1.00      1.06
50   0.3  -0.008  -0.211  -0.007   0.602   0.749   0.603    1.00    0.96    0.96      1.00      1.24
50   0.5  -0.008  -0.451  -0.009   0.602   0.914   0.603    1.00    0.90    0.97      1.00      1.52
100  0.1   0.006  -0.027   0.006   0.422   0.449   0.422    1.00    0.97    0.97      1.00      1.07
100  0.3   0.006  -0.184   0.007   0.422   0.536   0.422    1.00    0.95    0.97      1.00      1.27
100  0.5   0.006  -0.432   0.009   0.422   0.706   0.422    1.00    0.89    0.97      1.00      1.67
500  0.1  -0.003  -0.041  -0.003   0.187   0.203   0.187    1.00    0.95    0.94      1.00      1.09
500  0.3  -0.003  -0.183  -0.004   0.187   0.288   0.187    1.00    0.83    0.94      1.00      1.54
500  0.5  -0.003  -0.436  -0.005   0.187   0.501   0.187    1.00    0.55    0.94      1.00      2.68

Table B.191: Empirical variance and asymptotic variance of slope (β̂1) when ρ = 0.6 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

              Empirical Variance     Asymptotic Variance
n    p       CLS   Naive      PM     CLS   Naive      PM
50   0.1   0.004   0.004   0.004   0.004   0.004   0.004
50   0.3   0.004   0.005   0.004   0.004   0.004   0.004
50   0.5   0.004   0.006   0.004   0.004   0.005   0.004
100  0.1   0.002   0.002   0.002   0.002   0.002   0.002
100  0.3   0.002   0.002   0.002   0.002   0.002   0.002
100  0.5   0.002   0.003   0.002   0.002   0.003   0.002
500  0.1   0.000   0.000   0.000   0.000   0.000   0.000
500  0.3   0.000   0.000   0.000   0.000   0.000   0.000
500  0.5   0.000   0.001   0.000   0.000   0.001   0.000

Table B.192: Bias, RMSE, 95% CP, and relative RMSE of slope (β̂1) when ρ = 0.6 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5

                    Bias                    RMSE                      CP                  RRMSE
n    p       CLS   Naive      PM     CLS   Naive      PM     CLS   Naive      PM    CLS/PM  Naive/PM
50   0.1   0.001   0.010   0.001   0.059   0.062   0.059    0.94    0.93    0.93      1.00      1.04
50   0.3   0.001   0.028   0.001   0.059   0.071   0.059    0.94    0.90    0.93      1.00      1.20
50   0.5   0.001   0.041  -0.000   0.059   0.084   0.059    0.94    0.88    0.93      1.00      1.41
100  0.1  -0.000   0.009  -0.000   0.041   0.043   0.041    0.93    0.95    0.93      1.00      1.05
100  0.3  -0.000   0.025  -0.000   0.041   0.052   0.041    0.93    0.90    0.94      1.00      1.27
100  0.5  -0.000   0.038  -0.000   0.041   0.063   0.041    0.93    0.84    0.93      1.00      1.53
500  0.1   0.000   0.009   0.000   0.017   0.020   0.017    0.96    0.94    0.96      1.00      1.13
500  0.3   0.000   0.027   0.000   0.017   0.034   0.017    0.96    0.69    0.95      1.00      1.95
500  0.5   0.000   0.040   0.000   0.017   0.045   0.017    0.96    0.52    0.96      1.00      2.62

Table B.193: Empirical variance and asymptotic variance of mean of x (µ̂x) when ρ = 0.6 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

              Empirical Variance     Asymptotic Variance
n    p       CLS   Naive      PM     CLS   Naive      PM
50   0.1   1.954   2.026   1.954   1.984   2.336   1.944
50   0.3   1.954   2.702   1.966   1.984   3.255   1.943
50   0.5   1.954   4.234   1.992   1.984   4.381   1.939
100  0.1   0.937   0.984   0.936   0.996   1.174   0.986
100  0.3   0.937   1.434   0.946   0.996   1.643   0.986
100  0.5   0.937   2.634   0.952   0.996   2.218   0.985
500  0.1   0.208   0.242   0.208   0.200   0.236   0.200
500  0.3   0.208   0.612   0.209   0.200   0.329   0.200
500  0.5   0.208   1.749   0.208   0.200   0.444   0.199

Table B.194: Bias, RMSE, 95% CP, and relative RMSE of mean of x (µ̂x) when ρ = 0.6 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5

                    Bias                    RMSE                      CP                  RRMSE
n    p       CLS   Naive      PM     CLS   Naive      PM     CLS   Naive      PM    CLS/PM  Naive/PM
50   0.1   0.097   0.039   0.097   1.412   1.529   1.398    0.94    0.97    0.94      1.01      1.09
50   0.3   0.097  -0.421   0.097   1.412   1.853   1.397    0.94    0.95    0.94      1.01      1.33
50   0.5   0.097  -1.634   0.096   1.412   2.655   1.396    0.94    0.86    0.94      1.01      1.90
100  0.1   0.066   0.008   0.066   1.000   1.083   0.995    0.95    0.97    0.95      1.00      1.09
100  0.3   0.066  -0.462   0.064   1.000   1.363   0.995    0.95    0.96    0.95      1.01      1.37
100  0.5   0.066  -1.702   0.060   1.000   2.262   0.994    0.95    0.72    0.95      1.01      2.27
500  0.1   0.026  -0.042   0.026   0.448   0.487   0.447    0.94    0.94    0.94      1.00      1.09
500  0.3   0.026  -0.589   0.028   0.448   0.823   0.448    0.94    0.72    0.93      1.00      1.84
500  0.5   0.026  -1.917   0.026   0.448   2.030   0.447    0.94    0.35    0.94      1.00      4.54

Table B.195: Empirical variance and asymptotic variance of mean of y (µ̂y) when ρ = 0.6 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

              Empirical Variance     Asymptotic Variance
n    p       CLS   Naive      PM     CLS   Naive      PM
50   0.1   0.454   0.486   0.454   0.503   0.593   0.493
50   0.3   0.454   0.632   0.452   0.503   0.825   0.493
50   0.5   0.454   0.993   0.456   0.503   1.108   0.491
100  0.1   0.212   0.232   0.212   0.250   0.295   0.248
100  0.3   0.212   0.352   0.213   0.250   0.412   0.248
100  0.5   0.212   0.630   0.214   0.250   0.557   0.248
500  0.1   0.052   0.061   0.052   0.050   0.059   0.050
500  0.3   0.052   0.154   0.052   0.050   0.082   0.050
500  0.5   0.052   0.421   0.053   0.050   0.111   0.050

Table B.196: Bias, RMSE, 95% CP, and relative RMSE of mean of y (µ̂y) when ρ = 0.6 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5

                    Bias                    RMSE                      CP                  RRMSE
n    p       CLS   Naive      PM     CLS   Naive      PM     CLS   Naive      PM    CLS/PM  Naive/PM
50   0.1   0.019  -0.004   0.019   0.710   0.770   0.703    0.96    0.96    0.95      1.01      1.10
50   0.3   0.019  -0.273   0.019   0.710   0.949   0.702    0.96    0.97    0.95      1.01      1.35
50   0.5   0.019  -0.892   0.014   0.710   1.379   0.701    0.96    0.87    0.95      1.01      1.97
100  0.1   0.023  -0.000   0.023   0.501   0.543   0.498    0.97    0.96    0.97      1.01      1.09
100  0.3   0.023  -0.257   0.024   0.501   0.691   0.498    0.97    0.94    0.96      1.00      1.39
100  0.5   0.023  -0.882   0.026   0.501   1.155   0.498    0.97    0.72    0.96      1.00      2.32
500  0.1   0.005  -0.027   0.005   0.223   0.244   0.223    0.93    0.94    0.93      1.00      1.09
500  0.3   0.005  -0.293   0.005   0.223   0.410   0.223    0.93    0.71    0.93      1.00      1.84
500  0.5   0.005  -0.957   0.003   0.223   1.013   0.223    0.93    0.35    0.93      1.00      4.54

Table B.197: Empirical variance and asymptotic variance of standard deviation of x (σ̂x) when ρ = 0.6 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

              Empirical Variance     Asymptotic Variance
n    p       CLS   Naive      PM     CLS   Naive      PM
50   0.1   1.088   1.208   1.067   1.030   1.218   0.972
50   0.3   1.088   1.551   1.069   1.030   1.718   0.971
50   0.5   1.088   2.277   1.077   1.030   2.352   0.970
100  0.1   0.540   0.592   0.535   0.507   0.599   0.493
100  0.3   0.540   0.737   0.535   0.507   0.844   0.493
100  0.5   0.540   1.160   0.541   0.507   1.148   0.493
500  0.1   0.104   0.117   0.104   0.100   0.118   0.100
500  0.3   0.104   0.171   0.104   0.100   0.166   0.100
500  0.5   0.104   0.577   0.105   0.100   0.223   0.100

Table B.198: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of x (σ̂x) when ρ = 0.6 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5

                    Bias                    RMSE                      CP                  RRMSE
n    p       CLS   Naive      PM     CLS   Naive      PM     CLS   Naive      PM    CLS/PM  Naive/PM
50   0.1  -0.095   0.220  -0.194   1.019   1.126   1.005    0.94    0.96    0.93      1.01      1.12
50   0.3  -0.095   0.844  -0.199   1.019   1.559   1.005    0.94    0.96    0.93      1.01      1.55
50   0.5  -0.095   1.210  -0.209   1.019   1.954   1.007    0.94    0.94    0.93      1.01      1.94
100  0.1  -0.045   0.270  -0.095   0.714   0.820   0.709    0.93    0.95    0.92      1.01      1.16
100  0.3  -0.045   0.915  -0.097   0.714   1.296   0.709    0.93    0.89    0.92      1.01      1.83
100  0.5  -0.045   1.291  -0.101   0.714   1.678   0.709    0.93    0.83    0.92      1.01      2.37
500  0.1  -0.008   0.314  -0.017   0.317   0.466   0.316    0.94    0.86    0.94      1.00      1.47
500  0.3  -0.008   0.938  -0.017   0.317   1.022   0.316    0.94    0.35    0.94      1.00      3.23
500  0.5  -0.008   1.287  -0.019   0.317   1.371   0.316    0.94    0.26    0.94      1.00      4.33

Table B.199: Empirical variance and asymptotic variance of standard deviation of y (σ̂y) when ρ = 0.6 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

              Empirical Variance     Asymptotic Variance
n    p       CLS   Naive      PM     CLS   Naive      PM
50   0.1   0.271   0.302   0.265   0.261   0.309   0.247
50   0.3   0.271   0.380   0.265   0.261   0.436   0.246
50   0.5   0.271   0.536   0.268   0.261   0.595   0.246
100  0.1   0.127   0.142   0.126   0.127   0.150   0.124
100  0.3   0.127   0.186   0.125   0.127   0.211   0.124
100  0.5   0.127   0.322   0.127   0.127   0.288   0.124
500  0.1   0.024   0.027   0.024   0.025   0.030   0.025
500  0.3   0.024   0.040   0.024   0.025   0.041   0.025
500  0.5   0.024   0.132   0.024   0.025   0.056   0.025

Table B.200: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of y (σ̂y) when ρ = 0.6 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5

                    Bias                    RMSE                      CP                  RRMSE
n    p       CLS   Naive      PM     CLS   Naive      PM     CLS   Naive      PM    CLS/PM  Naive/PM
50   0.1  -0.011   0.148  -0.061   0.511   0.576   0.500    0.93    0.95    0.92      1.02      1.15
50   0.3  -0.011   0.461  -0.062   0.511   0.805   0.500    0.93    0.94    0.92      1.02      1.61
50   0.5  -0.011   0.640  -0.071   0.511   1.002   0.501    0.93    0.95    0.92      1.02      2.00
100  0.1  -0.011   0.148  -0.037   0.357   0.415   0.354    0.94    0.95    0.94      1.01      1.17
100  0.3  -0.011   0.463  -0.037   0.357   0.652   0.354    0.94    0.88    0.94      1.01      1.84
100  0.5  -0.011   0.655  -0.036   0.357   0.847   0.354    0.94    0.77    0.93      1.01      2.39
500  0.1  -0.007   0.152  -0.012   0.158   0.229   0.158    0.96    0.88    0.96      1.00      1.45
500  0.3  -0.007   0.468  -0.012   0.158   0.511   0.158    0.96    0.34    0.96      1.00      3.23
500  0.5  -0.007   0.650  -0.013   0.158   0.692   0.158    0.96    0.25    0.95      1.00      4.37

Table B.201: Empirical variance and asymptotic variance of correlation (ρ̂) when ρ = 0.6 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

              Empirical Variance     Asymptotic Variance
n    p       CLS   Naive      PM     CLS   Naive      PM
50   0.1   0.009   0.009   0.009   0.009   0.009   0.008
50   0.3   0.009   0.010   0.009   0.009   0.010   0.008
50   0.5   0.009   0.011   0.009   0.009   0.012   0.009
100  0.1   0.004   0.005   0.004   0.004   0.004   0.004
100  0.3   0.004   0.005   0.004   0.004   0.005   0.004
100  0.5   0.004   0.006   0.004   0.004   0.006   0.004
500  0.1   0.001   0.001   0.001   0.001   0.001   0.001
500  0.3   0.001   0.001   0.001   0.001   0.001   0.001
500  0.5   0.001   0.002   0.001   0.001   0.001   0.001

Table B.202: Bias, RMSE, 95% CP, and relative RMSE of correlation (ρ̂) when ρ = 0.6 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5

                    Bias                    RMSE                      CP                  RRMSE
n    p       CLS   Naive      PM     CLS   Naive      PM     CLS   Naive      PM    CLS/PM  Naive/PM
50   0.1  -0.007   0.010  -0.007   0.094   0.097   0.092    0.92    0.91    0.92      1.02      1.05
50   0.3  -0.007   0.045  -0.008   0.094   0.110   0.092    0.92    0.84    0.91      1.02      1.20
50   0.5  -0.007   0.069  -0.010   0.094   0.129   0.093    0.92    0.78    0.92      1.02      1.39
100  0.1  -0.004   0.013  -0.004   0.066   0.068   0.065    0.94    0.92    0.94      1.01      1.04
100  0.3  -0.004   0.047  -0.004   0.066   0.084   0.065    0.94    0.83    0.94      1.01      1.29
100  0.5  -0.004   0.072  -0.005   0.066   0.103   0.065    0.94    0.72    0.94      1.01      1.59
500  0.1  -0.000   0.018  -0.000   0.028   0.035   0.028    0.95    0.89    0.95      1.00      1.24
500  0.3  -0.000   0.054   0.000   0.028   0.062   0.028    0.95    0.55    0.94      1.00      2.18
500  0.5  -0.000   0.077   0.000   0.028   0.084   0.028    0.95    0.34    0.95      1.00      2.95

Table B.203: Empirical variance and asymptotic variance of intercept (β̂0) when ρ = 0.8 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

              Empirical Variance     Asymptotic Variance
n    p       CLS   Naive      PM     CLS   Naive      PM
50   0.1   0.170   0.188   0.136   0.204   0.226   0.211
50   0.3   0.170   0.229   0.171   0.204   0.273   0.222
50   0.5   0.170   0.304   0.178   0.204   0.322   0.230
100  0.1   0.083   0.090   0.086   0.100   0.111   0.105
100  0.3   0.083   0.116   0.086   0.100   0.135   0.107
100  0.5   0.083   0.147   0.090   0.100   0.160   0.109
500  0.1   0.019   0.022   0.020   0.020   0.022   0.020
500  0.3   0.019   0.030   0.020   0.020   0.026   0.021
500  0.5   0.019   0.043   0.020   0.020   0.031   0.021

Table B.204: Bias, RMSE, 95% CP, and relative RMSE of intercept (β̂0) when ρ = 0.8 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5

                    Bias                    RMSE                      CP                  RRMSE
n    p       CLS   Naive      PM     CLS   Naive      PM     CLS   Naive      PM    CLS/PM  Naive/PM
50   0.1  -0.010  -0.039  -0.002   0.452   0.477   0.459    1.00    0.96    0.99      0.98      1.04
50   0.3  -0.010  -0.123   0.031   0.452   0.537   0.473    1.00    0.96    0.97      0.96      1.14
50   0.5  -0.010  -0.261   0.037   0.452   0.624   0.481    1.00    0.93    0.97      0.94      1.30
100  0.1   0.001  -0.031   0.009   0.316   0.335   0.323    1.00    0.96    0.97      0.98      1.04
100  0.3   0.001  -0.110   0.024   0.316   0.383   0.327    1.00    0.96    0.97      0.97      1.17
100  0.5   0.001  -0.230   0.057   0.316   0.461   0.335    1.00    0.92    0.97      0.94      1.38
500  0.1  -0.003  -0.031   0.006   0.140   0.151   0.143    1.00    0.94    0.96      0.98      1.06
500  0.3  -0.003  -0.119   0.011   0.140   0.202   0.144    1.00    0.86    0.95      0.97      1.40
500  0.5  -0.003  -0.254   0.018   0.140   0.309   0.146    1.00    0.68    0.95      0.96      2.12

Table B.205: Empirical variance and asymptotic variance of slope (β̂1) when ρ = 0.8 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

              Empirical Variance     Asymptotic Variance
n    p       CLS   Naive      PM     CLS   Naive      PM
50   0.1   0.002   0.002   0.002   0.002   0.002   0.002
50   0.3   0.002   0.002   0.002   0.002   0.002   0.002
50   0.5   0.002   0.003   0.003   0.002   0.003   0.002
100  0.1   0.001   0.001   0.001   0.001   0.001   0.001
100  0.3   0.001   0.001   0.001   0.001   0.001   0.001
100  0.5   0.001   0.002   0.001   0.001   0.001   0.001
500  0.1   0.000   0.000   0.000   0.000   0.000   0.000
500  0.3   0.000   0.000   0.000   0.000   0.000   0.000
500  0.5   0.000   0.000   0.000   0.000   0.000   0.000

Table B.206: Bias, RMSE, 95% CP, and relative RMSE of slope (β̂1) when ρ = 0.8 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5

                    Bias                    RMSE                      CP                  RRMSE
n    p       CLS   Naive      PM     CLS   Naive      PM     CLS   Naive      PM    CLS/PM  Naive/PM
50   0.1   0.001   0.009  -0.002   0.044   0.046   0.045    0.94    0.93    0.92      0.97      1.02
50   0.3   0.001   0.022  -0.010   0.044   0.052   0.048    0.94    0.92    0.93      0.91      1.08
50   0.5   0.001   0.029  -0.014   0.044   0.059   0.050    0.94    0.88    0.93      0.87      1.17
100  0.1   0.000   0.007  -0.002   0.030   0.032   0.032    0.93    0.94    0.90      0.95      1.02
100  0.3   0.000   0.019  -0.006   0.030   0.038   0.032    0.93    0.91    0.93      0.93      1.19
100  0.5   0.000   0.027  -0.009   0.030   0.044   0.033    0.93    0.84    0.92      0.91      1.33
500  0.1  -0.000   0.007  -0.003   0.014   0.016   0.014    0.96    0.94    0.92      0.98      1.08
500  0.3  -0.000   0.019  -0.005   0.014   0.024   0.015    0.96    0.73    0.93      0.95      1.58
500  0.5  -0.000   0.026  -0.005   0.014   0.030   0.015    0.96    0.55    0.93      0.94      1.97

Table B.207: Empirical variance and asymptotic variance of mean of x (µ̂x) when ρ = 0.8 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

              Empirical Variance     Asymptotic Variance
n    p       CLS   Naive      PM     CLS   Naive      PM
50   0.1   1.950   2.007   1.525   1.985   2.361   1.950
50   0.3   1.950   2.533   2.003   1.985   3.337   1.957
50   0.5   1.950   3.870   2.078   1.985   4.491   1.891
100  0.1   0.924   0.960   0.925   0.996   1.187   0.995
100  0.3   0.924   1.422   0.924   0.996   1.682   0.993
100  0.5   0.924   2.753   0.930   0.996   2.257   0.973
500  0.1   0.209   0.253   0.209   0.200   0.239   0.201
500  0.3   0.209   0.680   0.210   0.200   0.338   0.202
500  0.5   0.209   1.923   0.212   0.200   0.452   0.200

Table B.208: Bias, RMSE, 95% CP, and relative RMSE of mean of x (µ̂x) when ρ = 0.8 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5

                    Bias                    RMSE                      CP                  RRMSE
n    p       CLS   Naive      PM     CLS   Naive      PM     CLS   Naive      PM    CLS/PM  Naive/PM
50   0.1   0.095   0.042   0.094   1.412   1.537   1.400    0.95    0.97    0.98      1.01      1.10
50   0.3   0.095  -0.513   0.088   1.412   1.897   1.402    0.95    0.96    0.94      1.01      1.35
50   0.5   0.095  -1.813   0.224   1.412   2.789   1.393    0.95    0.85    0.93      1.01      2.00
100  0.1   0.066   0.017   0.069   1.000   1.090   1.000    0.96    0.98    0.96      1.00      1.09
100  0.3   0.066  -0.546   0.075   1.000   1.407   0.999    0.96    0.94    0.95      1.00      1.41
100  0.5   0.066  -1.937   0.195   1.000   2.452   1.005    0.96    0.70    0.96      1.00      2.44
500  0.1   0.026  -0.041   0.025   0.448   0.490   0.449    0.93    0.93    0.94      1.00      1.09
500  0.3   0.026  -0.630   0.025   0.448   0.857   0.450    0.93    0.69    0.94      1.00      1.91
500  0.5   0.026  -2.070   0.074   0.448   2.177   0.454    0.93    0.31    0.93      0.99      4.80

Table B.209: Empirical variance and asymptotic variance of mean of y (µ̂y) when ρ = 0.8 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

              Empirical Variance     Asymptotic Variance
n    p       CLS   Naive      PM     CLS   Naive      PM
50   0.1   0.463   0.470   0.349   0.501   0.597   0.492
50   0.3   0.463   0.596   0.473   0.501   0.846   0.491
50   0.5   0.463   0.965   0.484   0.501   1.134   0.480
100  0.1   0.213   0.224   0.213   0.250   0.298   0.250
100  0.3   0.213   0.345   0.215   0.250   0.422   0.249
100  0.5   0.213   0.664   0.221   0.250   0.567   0.245
500  0.1   0.052   0.062   0.052   0.050   0.060   0.050
500  0.3   0.052   0.170   0.052   0.050   0.084   0.050
500  0.5   0.052   0.477   0.051   0.050   0.113   0.050

Table B.210: Bias, RMSE, 95% CP, and relative RMSE of mean of y (µ̂y) when ρ = 0.8 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5

                    Bias                    RMSE                      CP                  RRMSE
n    p       CLS   Naive      PM     CLS   Naive      PM     CLS   Naive      PM    CLS/PM  Naive/PM
50   0.1   0.028   0.000   0.035   0.709   0.773   0.702    0.95    0.97    0.98      1.01      1.10
50   0.3   0.028  -0.277   0.033   0.709   0.961   0.701    0.95    0.97    0.94      1.01      1.37
50   0.5   0.028  -0.950   0.081   0.709   1.427   0.697    0.95    0.86    0.93      1.02      2.05
100  0.1   0.027  -0.005   0.027   0.500   0.546   0.501    0.96    0.97    0.96      1.00      1.09
100  0.3   0.027  -0.279   0.035   0.500   0.707   0.500    0.96    0.95    0.96      1.00      1.41
100  0.5   0.027  -0.963   0.106   0.500   1.223   0.506    0.96    0.69    0.95      0.99      2.41
500  0.1   0.007  -0.028   0.007   0.224   0.246   0.224    0.94    0.93    0.94      1.00      1.10
500  0.3   0.007  -0.325   0.008   0.224   0.436   0.225    0.94    0.69    0.94      1.00      1.94
500  0.5   0.007  -1.048   0.033   0.224   1.100   0.227    0.94    0.31    0.93      0.99      4.86

Table B.211: Empirical variance and asymptotic variance of standard deviation of x (σ̂x) when ρ = 0.8 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

              Empirical Variance     Asymptotic Variance
n    p       CLS   Naive      PM     CLS   Naive      PM
50   0.1   1.082   1.188   1.030   1.040   1.244   0.975
50   0.3   1.082   1.485   1.078   1.040   1.784   0.978
50   0.5   1.082   2.142   1.039   1.040   2.448   0.946
100  0.1   0.533   0.596   0.612   0.510   0.609   0.498
100  0.3   0.533   0.768   0.506   0.510   0.869   0.496
100  0.5   0.533   1.284   0.441   0.510   1.177   0.486
500  0.1   0.105   0.118   0.113   0.100   0.120   0.101
500  0.3   0.105   0.177   0.106   0.100   0.170   0.101
500  0.5   0.105   0.658   0.106   0.100   0.228   0.100

Table B.212: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of x (σ̂x) when ρ = 0.8 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5

                    Bias                    RMSE                      CP                  RRMSE
n    p       CLS   Naive      PM     CLS   Naive      PM     CLS   Naive      PM    CLS/PM  Naive/PM
50   0.1  -0.092   0.279  -0.120   1.024   1.150   0.995    0.93    0.96    0.92      1.03      1.16
50   0.3  -0.092   1.013  -0.164   1.024   1.676   1.003    0.93    0.95    0.92      1.02      1.67
50   0.5  -0.092   1.466  -0.329   1.024   2.144   1.027    0.93    0.93    0.91      1.00      2.09
100  0.1  -0.046   0.330  -0.056   0.715   0.847   0.708    0.94    0.94    0.92      1.01      1.20
100  0.3  -0.046   1.075  -0.062   0.715   1.423   0.707    0.94    0.85    0.94      1.01      2.01
100  0.5  -0.046   1.497  -0.160   0.715   1.848   0.716    0.94    0.76    0.95      1.00      2.58
500  0.1  -0.008   0.376   0.019   0.317   0.511   0.318    0.94    0.82    0.94      1.00      1.61
500  0.3  -0.008   1.120   0.038   0.317   1.193   0.320    0.94    0.20    0.95      0.99      3.73
500  0.5  -0.008   1.517   0.001   0.317   1.591   0.316    0.94    0.21    0.93      1.00      5.03

Table B.213: Empirical variance and asymptotic variance of standard deviation of y (σ̂y) when ρ = 0.8 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

              Empirical Variance     Asymptotic Variance
n    p       CLS   Naive      PM     CLS   Naive      PM
50   0.1   0.274   0.306   0.196   0.263   0.315   0.246
50   0.3   0.274   0.376   0.259   0.263   0.453   0.245
50   0.5   0.274   0.544   0.272   0.263   0.618   0.240
100  0.1   0.127   0.142   0.138   0.128   0.153   0.125
100  0.3   0.127   0.185   0.121   0.128   0.218   0.125
100  0.5   0.127   0.322   0.114   0.128   0.296   0.123
500  0.1   0.025   0.027   0.027   0.025   0.030   0.025
500  0.3   0.025   0.041   0.025   0.025   0.042   0.025
500  0.5   0.025   0.157   0.023   0.025   0.057   0.025

Table B.214: Bias, RMSE, 95% CP, and relative RMSE of standard deviation of y (σ̂y) when ρ = 0.8 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5

                    Bias                    RMSE                      CP                  RRMSE
n    p       CLS   Naive      PM     CLS   Naive      PM     CLS   Naive      PM    CLS/PM  Naive/PM
50   0.1  -0.021   0.169  -0.041   0.513   0.586   0.498    0.93    0.95    0.95      1.03      1.18
50   0.3  -0.021   0.546  -0.073   0.513   0.866   0.501    0.93    0.94    0.92      1.02      1.73
50   0.5  -0.021   0.759  -0.130   0.513   1.093   0.507    0.93    0.91    0.89      1.01      2.16
100  0.1  -0.016   0.174  -0.012   0.358   0.428   0.354    0.94    0.96    0.92      1.01      1.21
100  0.3  -0.016   0.546  -0.021   0.358   0.719   0.354    0.94    0.85    0.94      1.01      2.03
100  0.5  -0.016   0.763  -0.060   0.358   0.937   0.355    0.94    0.71    0.94      1.01      2.64
500  0.1  -0.006   0.186   0.010   0.159   0.254   0.159    0.96    0.83    0.96      1.00      1.60
500  0.3  -0.006   0.558   0.019   0.159   0.594   0.160    0.96    0.21    0.97      0.99      3.72
500  0.5  -0.006   0.756   0.012   0.159   0.793   0.159    0.96    0.19    0.98      1.00      4.99

Table B.215: Empirical variance and asymptotic variance of correlation (ρ̂) when ρ = 0.8 over different interval censoring proportions of 0.1, 0.3, and 0.5 and sample sizes 50, 100, and 500

              Empirical Variance     Asymptotic Variance
n    p       CLS   Naive      PM     CLS   Naive      PM
50   0.1   0.003   0.003   0.004   0.003   0.003   0.004
50   0.3   0.003   0.003   0.004   0.003   0.003   0.003
50   0.5   0.003   0.003   0.004   0.003   0.003   0.004
100  0.1   0.001   0.001   0.002   0.001   0.001   0.001
100  0.3   0.001   0.001   0.002   0.001   0.001   0.002
100  0.5   0.001   0.002   0.002   0.001   0.002   0.002
500  0.1   0.000   0.000   0.000   0.000   0.000   0.000
500  0.3   0.000   0.000   0.000   0.000   0.000   0.000
500  0.5   0.000   0.001   0.000   0.000   0.000   0.000

Table B.216: Bias, RMSE, 95% CP, and relative RMSE of correlation (ρ̂) when ρ = 0.8 over sample sizes 50, 100, and 500 and interval censoring proportions of 0.1, 0.3, and 0.5

                    Bias                    RMSE                      CP                  RRMSE
n    p       CLS   Naive      PM     CLS   Naive      PM     CLS   Naive      PM    CLS/PM  Naive/PM
50   0.1  -0.005   0.009  -0.011   0.055   0.055   0.060    0.94    0.90    0.92      0.91      0.91
50   0.3  -0.005   0.034  -0.025   0.055   0.064   0.063    0.94    0.79    0.94      0.88      1.02
50   0.5  -0.005   0.049  -0.037   0.055   0.074   0.071    0.94    0.73    0.93      0.78      1.06
100  0.1  -0.003   0.010  -0.010   0.038   0.039   0.039    0.94    0.91    0.90      0.97      1.01
100  0.3  -0.003   0.035  -0.015   0.038   0.050   0.042    0.94    0.75    0.92      0.90      1.21
100  0.5  -0.003   0.049  -0.023   0.038   0.062   0.046    0.94    0.61    0.92      0.81      1.35
500  0.1  -0.000   0.013  -0.006   0.017   0.022   0.018    0.94    0.82    0.88      0.94      1.18
500  0.3  -0.000   0.038  -0.010   0.017   0.040   0.020    0.94    0.33    0.88      0.87      2.04
500  0.5  -0.000   0.052  -0.012   0.017   0.055   0.021    0.94    0.23    0.88      0.82      2.60
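The RRMSE columns in the Bias/RMSE tables of this appendix are headed CLS/PM and Naive/PM, which suggests that each entry is the RMSE of the CLS or Naive estimator divided by the RMSE of the PM estimator. The following is a minimal Python sketch under that assumption, using the rounded RMSE values reported for n = 500 in Table B.190. The names `relative_rmse`, `rrmse_rows`, and `table_b190_n500` are illustrative only and are not taken from the thesis code. Because the inputs are already rounded to three decimals, recomputed ratios for other tables may differ from the tabulated values in the last digit.

```python
def relative_rmse(rmse_method: float, rmse_pm: float) -> float:
    """Ratio of a competing method's RMSE to the PM RMSE (assumed definition)."""
    if rmse_pm <= 0:
        raise ValueError("rmse_pm must be positive")
    return rmse_method / rmse_pm


# Rounded RMSE values copied from Table B.190 (intercept, rho = 0.6, n = 500),
# keyed by interval censoring proportion p.
table_b190_n500 = {
    0.1: {"CLS": 0.187, "Naive": 0.203, "PM": 0.187},
    0.3: {"CLS": 0.187, "Naive": 0.288, "PM": 0.187},
    0.5: {"CLS": 0.187, "Naive": 0.501, "PM": 0.187},
}


def rrmse_rows(table):
    """Return {p: (CLS/PM, Naive/PM)} with ratios rounded to two decimals."""
    rows = {}
    for p, rmse in table.items():
        cls_pm = round(relative_rmse(rmse["CLS"], rmse["PM"]), 2)
        naive_pm = round(relative_rmse(rmse["Naive"], rmse["PM"]), 2)
        rows[p] = (cls_pm, naive_pm)
    return rows


if __name__ == "__main__":
    for p, (cls_pm, naive_pm) in rrmse_rows(table_b190_n500).items():
        print(f"p = {p}: CLS/PM = {cls_pm:.2f}, Naive/PM = {naive_pm:.2f}")
```

A ratio above 1 in the Naive/PM column means the naive estimator has a larger RMSE than PM for that setting, and a CLS/PM ratio near 1 means PM performs about as well as the estimator fitted to the complete data.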