Not Missing at Random: Recently Published Documents

Total documents: 20 (five years: 7)
H-index: 5 (five years: 1)

Author(s): Jin Hyuk Lee, J. Charles Huber Jr.

Background: Multiple imputation (MI) is known to be an effective method for handling missing data in public health research. However, it is not clear whether the method remains effective when a variable has a high percentage of missing observations. Methods: Using data from the "Predictive Study of Coronary Heart Disease," this study examined the effectiveness of multiple imputation in data with 20% to 80% missing observations, using the absolute bias (|bias|) and root mean square error (RMSE) of MI measured under the Missing Completely at Random (MCAR), Missing at Random (MAR), and Not Missing at Random (NMAR) assumptions. Results: The |bias| and RMSE of MI were much smaller than those of complete case analysis (CCA) under all missing-data mechanisms, especially at high percentages of missingness. In addition, the |bias| and RMSE of MI remained consistent as the number of imputations increased from M=10 to M=50. Moreover, among the imputation mechanisms compared, the MCMC method had uniformly smaller |bias| and RMSE than the regression and predictive mean matching methods under all missing-data mechanisms. Conclusion: As missing percentages become higher, MI is recommended because it produced less biased estimates under all missing-data mechanisms. However, when large proportions of data are missing, other factors must also be considered for proper imputation, such as the number of imputations, the imputation mechanism, and the missing-data mechanism.
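
As a rough illustration of the kind of comparison described above, the sketch below simulates MAR missingness on one variable and contrasts complete case analysis with multiple imputation, approximated here with scikit-learn's IterativeImputer drawing from the posterior (the original study used different software). The data-generating model and all parameters are illustrative assumptions, not the study's data.

```python
# Minimal sketch: compare complete case analysis (CCA) with multiple
# imputation (MI) on a variable with MAR missingness. Illustrative only;
# the data-generating model below is an assumption, not the study's data.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)
n, n_reps, M = 500, 100, 10
true_mean = 2.0

cca_err, mi_err = [], []
for rep in range(n_reps):
    x = rng.normal(0.0, 1.0, n)                # fully observed covariate
    y = true_mean + 0.8 * x + rng.normal(0.0, 1.0, n)
    # MAR: P(y missing) depends on the observed x, not on y itself
    miss = rng.random(n) < 1 / (1 + np.exp(-(x - 0.5)))
    y_obs = y.copy()
    y_obs[miss] = np.nan

    cca_err.append(np.nanmean(y_obs) - true_mean)

    # MI: M imputations, each drawing from an approximate posterior,
    # then pool the M point estimates (Rubin's rule for the mean)
    data = np.column_stack([x, y_obs])
    ests = []
    for m in range(M):
        imp = IterativeImputer(sample_posterior=True, random_state=m)
        ests.append(imp.fit_transform(data)[:, 1].mean())
    mi_err.append(np.mean(ests) - true_mean)

for name, err in [("CCA", cca_err), ("MI", mi_err)]:
    err = np.asarray(err)
    print(f"{name}: |bias|={abs(err.mean()):.4f}  RMSE={np.sqrt((err**2).mean()):.4f}")
```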


2020, Vol. 44(2), pp. 247–255
Author(s): Boris Blažinić, Lovorka Gotal Dmitrović, Marko Stojić

Competencies represent a dynamic combination of cognitive and metacognitive skills, knowledge and understanding, interpersonal and practical skills, and ethical values. Since the system comprises many entities, as well as many activities between entities, it belongs, according to systems theory, to the class of complex systems. The paper develops a conceptual and computational model of interpersonal competencies for process optimization and methodology design, using simulation modeling. The developed model enables faster data collection and more accurate results; it avoids human error in data entry and processing, allows survey time to be measured and more easily restricted, avoids NMAR (Not Missing at Random) data, and makes socially desirable responses easier to avoid.


Author(s): David Haziza, Sixia Chen, Yimeng Gao

Abstract: In the presence of nonresponse, unadjusted estimators are vulnerable to nonresponse bias when the characteristics of the respondents differ from those of the nonrespondents. To reduce the bias, it is common practice to postulate a nonresponse model linking the response indicators and a set of fully observed variables. Estimated response probabilities are obtained by fitting the selected model and are then used to adjust the base weights. The resulting estimator, referred to as the propensity score-adjusted estimator, is consistent provided the nonresponse model is correctly specified. In this article, we propose a weighting procedure that may improve the efficiency of propensity score estimators for survey variables identified as key variables, by making more extensive use of the auxiliary information available at the nonresponse treatment stage. Results from a simulation study suggest that the proposed procedure performs well in terms of efficiency when the data are missing at random and also achieves an efficient bias reduction when the data are not missing at random. We further apply the proposed methods to the 2017–2018 National Health and Nutrition Examination Survey.
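
A minimal sketch of the baseline propensity score adjustment the authors build on: fit a response model on fully observed auxiliaries, invert the estimated probabilities to adjust the base weights, and compute a weighted mean. The variable names and the logistic response model are assumptions for illustration; the paper's proposed procedure refines this step with a more extensive use of the auxiliary information.

```python
# Minimal sketch of propensity score adjustment for unit nonresponse.
# Assumed setup: base design weights d, fully observed auxiliaries X,
# response indicator r, and survey variable y observed only for respondents.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 2000
X = rng.normal(size=(n, 2))                   # auxiliaries known for all units
d = np.full(n, 50.0)                          # base (design) weights
p_true = 1 / (1 + np.exp(-(0.3 + 0.8 * X[:, 0])))
r = rng.random(n) < p_true                    # response indicators
y = 5 + X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=n)

# Fit the nonresponse model on the full sample (response status is known for all)
model = LogisticRegression().fit(X, r)
p_hat = model.predict_proba(X)[:, 1]

# Propensity score-adjusted weights for respondents: d_i / p_hat_i
w = d[r] / p_hat[r]

naive = np.average(y[r], weights=d[r])        # unadjusted (biased under MAR)
psa = np.average(y[r], weights=w)             # propensity score-adjusted
print(f"unadjusted: {naive:.3f}  PSA: {psa:.3f}  "
      f"full-sample: {np.average(y, weights=d):.3f}")
```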


2020, Vol. 10(1)
Author(s): Alexandra Maertens, Vy P. Tran, Mikhail Maertens, Andre Kleensang, Thomas H. Luechtefeld, et al.

Abstract: Cancer is a comparatively well-studied disease, yet despite decades of intense focus, we demonstrate here, using data from The Cancer Genome Atlas, that a substantial number of genes implicated in cancer are relatively poorly studied. Those genes will likely be missed by any data-analysis pipeline, such as enrichment analysis, that depends exclusively on annotations for understanding biological function. There is no indication that the amount of research, as indicated by the number of publications, is correlated with any objective metric of gene significance. Moreover, these genes are not missing at random; they reflect the fact that our information about genes is gathered in a biased manner: poorly studied genes are more likely to be primate-specific, less likely to have a Mendelian inheritance pattern, and tend to cluster in some biological processes and not others. While this likely reflects both technological limitations and the fact that well-known genes tend to attract more interest from the research community, in the absence of a concerted effort to study genes in an unbiased way, many genes (and biological processes) will remain opaque.
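
The publication-versus-significance claim above amounts to a rank-correlation check. A sketch of that style of test, with an entirely hypothetical input file and column names (per-gene publication counts and some objective significance score, e.g., a mutation-frequency metric), not the paper's actual data:

```python
# Hypothetical check of whether research attention (publication count)
# tracks an objective significance metric. The input file and column
# names are assumptions for illustration, not the paper's data.
import pandas as pd
from scipy.stats import spearmanr

genes = pd.read_csv("gene_metrics.csv")   # hypothetical per-gene table
rho, p = spearmanr(genes["n_publications"], genes["significance_score"])
print(f"Spearman rho = {rho:.3f} (p = {p:.3g})")
```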


2019, Vol. 80(2), pp. 389–398
Author(s): Tenko Raykov, Abdullah A. Al-Qataee, Dimiter M. Dimitrov

A procedure for evaluating validity-related coefficients and their differences is discussed, which is applicable when one or more assumptions frequently made in empirical educational, behavioral, and social research are violated. The method is developed within the framework of the latent variable modeling methodology and accomplishes point and interval estimation of convergent and discriminant correlations, as well as differences between them, in cases of incomplete data sets with data not missing at random, nonnormality, and clustering effects. The procedure uses the full information maximum likelihood approach to model fitting and parameter estimation, does not assume the availability of multiple indicators for underlying latent constructs, includes auxiliary variables, and accounts for within-group correlations on the main response variables resulting from nesting effects involving the studied respondents. The outlined procedure is illustrated with empirical data from a study using tertiary education entrance examination measures.
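
As a bare-bones illustration of the full information maximum likelihood (FIML) idea underlying the procedure, the sketch below estimates a correlation from incomplete data by maximizing each case's likelihood over its observed entries only, rather than dropping incomplete cases. This is a generic FIML sketch under a bivariate-normal assumption, not the authors' latent variable model (which additionally handles auxiliary variables, nonnormality, and clustering effects).

```python
# Generic FIML sketch: estimate a correlation from incomplete bivariate
# data by maximizing each case's likelihood over its observed entries.
# Assumes bivariate normality; not the authors' full latent variable model.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm, multivariate_normal

rng = np.random.default_rng(2)
n, rho_true = 400, 0.6
data = rng.multivariate_normal([0, 0], [[1, rho_true], [rho_true, 1]], n)
miss = rng.random(n) < 1 / (1 + np.exp(-data[:, 0]))  # MAR: depends on observed x
data[miss, 1] = np.nan

obs_both = data[~np.isnan(data[:, 1])]        # complete cases
obs_x = data[np.isnan(data[:, 1]), 0]         # cases with only x observed

def neg_loglik(theta):
    m1, m2, ls1, ls2, z = theta
    s1, s2 = np.exp(ls1), np.exp(ls2)         # log-scale keeps SDs positive
    r = np.tanh(z)                            # keeps correlation in (-1, 1)
    cov = [[s1**2, r * s1 * s2], [r * s1 * s2, s2**2]]
    ll = norm.logpdf(obs_x, m1, s1).sum()     # marginal density, x-only cases
    ll += multivariate_normal.logpdf(obs_both, [m1, m2], cov).sum()
    return -ll

res = minimize(neg_loglik, x0=[0, 0, 0, 0, 0], method="Nelder-Mead",
               options={"maxiter": 5000})
print(f"FIML correlation estimate: {np.tanh(res.x[4]):.3f} (true {rho_true})")
```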


2018, Vol. 34(1), pp. 107–120
Author(s): Phillip S. Kott, Dan Liao

Abstract: When adjusting for unit nonresponse in a survey, it is common to assume that the response/nonresponse mechanism is a function of variables known either for the entire sample before unit response or at the aggregate level for the frame or population. Often, however, some of the variables governing the response/nonresponse mechanism can only be proxied by variables on the frame, while they are measured (more) accurately on the survey itself. For example, an address-based sampling frame may contain area-level estimates of median annual income and the fraction of home ownership in a Census block group, while a household's annual income category and ownership status are reported on the survey itself for the housing units responding to the survey. A relatively new calibration-weighting technique allows a statistician to calibrate the sample using the proxy variables while assuming the response/nonresponse mechanism is a function of the analogous survey variables. We demonstrate how this can be done with data from the Residential Energy Consumption Survey National Pilot, a nationally representative web-and-mail survey of American households sponsored by the U.S. Energy Information Administration.
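
A stripped-down sketch of the calibration idea described above: the weight adjustment is a function of the survey-reported variables (the assumed drivers of response), while the calibration equations are solved so that weighted totals of the frame's proxy variables match their known frame totals. The variable names and the inverse-logistic adjustment form are illustrative assumptions, not the exact estimator in the paper.

```python
# Sketch of calibration weighting with proxy variables: the response
# adjustment depends on SURVEY variables x (observed for respondents),
# but the calibration targets are frame totals of the proxies z.
# All names and the inverse-logistic form are illustrative assumptions.
import numpy as np
from scipy.optimize import root

rng = np.random.default_rng(3)
n = 1500
z = rng.normal(1.0, 0.5, size=(n, 2))          # frame proxies (e.g., area income)
x = z + rng.normal(0, 0.3, size=(n, 2))        # survey values proxied by z
d = np.full(n, 40.0)                           # base design weights
resp = rng.random(n) < 1 / (1 + np.exp(-(0.5 + x[:, 0])))

frame_totals = (d[:, None] * z).sum(axis=0)    # known from the full frame
xr, zr, dr = x[resp], z[resp], d[resp]

def calib_eqs(gamma):
    # adjustment 1/p = 1 + exp(-eta), with eta a function of survey variables
    adj = 1 + np.exp(-(gamma[0] + gamma[1] * xr[:, 0] + gamma[2] * xr[:, 1]))
    w = dr * adj
    resid = (w[:, None] * zr).sum(axis=0) - frame_totals
    return np.append(resid, w.sum() - d.sum()) # also hit the weighted count

sol = root(calib_eqs, x0=np.zeros(3), method="hybr")
print("converged:", sol.success, "gamma:", np.round(sol.x, 3))
```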


2018, Vol. 21(01), Article 1850002
Author(s): Guy Kelman, Eran Manes, Marco Lamieri, David S. Brée

Many real-world networks are known to exhibit behavior that runs counter to established theories of network formation and communication patterns. A common prerequisite in network analysis is that information on nodes and links be complete, because network topologies are extremely sensitive to missing information of this kind; real-world networks that fail to meet this criterion under random sampling may therefore be discarded. In this paper, we offer a framework for interpreting the missing observations in network data under the hypothesis that these observations are not missing at random. We demonstrate the methodology with a case study of a financial trade network, where agents' awareness of data collection by a self-interested observer may result in strategic revealing or withholding of information. This non-random missingness has been overlooked, despite possibly being an important feature of the processes by which the network is generated. The analysis demonstrates that strategic information withholding may be a general phenomenon in complex systems. The evidence is sufficient to support the existence of an influential observer and to offer a compelling dynamic mechanism for the creation of the network.
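
To make the not-missing-at-random hypothesis concrete, the sketch below simulates a network in which well-connected agents strategically withhold a share of their links from the observer, then compares the true and observed degree structure. The withholding rule and all parameters are invented for illustration; they are not the paper's model of the trade network.

```python
# Illustrative simulation of strategically missing links (NMAR):
# high-degree nodes hide a fraction of their edges from the observer.
# The withholding rule and parameters are invented for illustration.
import networkx as nx
import numpy as np

rng = np.random.default_rng(4)
G = nx.barabasi_albert_graph(500, 3, seed=4)   # stand-in trade network

observed = nx.Graph()
observed.add_nodes_from(G)
median_deg = np.median([deg for _, deg in G.degree()])
for u, v in G.edges():
    # strategic withholding: an edge is hidden with higher probability
    # when either endpoint is a high-degree (well-connected) agent
    p_hide = 0.5 if max(G.degree(u), G.degree(v)) > median_deg else 0.05
    if rng.random() > p_hide:
        observed.add_edge(u, v)

true_deg = np.array([deg for _, deg in G.degree()])
obs_deg = np.array([observed.degree(node) for node in G])
print(f"true mean degree: {true_deg.mean():.2f}  observed: {obs_deg.mean():.2f}")
print(f"degree correlation (true vs observed): "
      f"{np.corrcoef(true_deg, obs_deg)[0, 1]:.2f}")
```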

