Simulation Study Sample Clauses

Simulation Study. A simulation study is performed to compare the proposed adjusted degree of distinguishability with the classical one. We also aim to develop a table for interpreting the adjusted degree of distinguishability. To generate 2 × 2 contingency tables, we used the method presented by Xxxxxx and Xxxx [8]. A bivariate standard normal distribution is used. In the first step, two independent and identically distributed random variables (X1 and X2) are generated. Equations (4.1) and (4.2) are used to generate two random variables (X and Y) from a bivariate normal distribution with a given correlation (ρ).
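A minimal Python sketch of this generation scheme. Equations (4.1) and (4.2) are not reproduced in the clause, so the standard transformation X = X1, Y = ρX1 + sqrt(1 − ρ²)X2 is assumed, as is dichotomization at the marginal medians to obtain a 2 × 2 table; the function name and thresholds are illustrative, not from the source.

```python
import numpy as np

def simulate_2x2_table(n, rho, seed=None):
    """Generate one 2x2 contingency table from a bivariate standard
    normal with correlation rho (assumed construction; see lead-in)."""
    rng = np.random.default_rng(seed)
    x1 = rng.standard_normal(n)  # i.i.d. N(0,1)
    x2 = rng.standard_normal(n)  # i.i.d. N(0,1)
    x = x1
    y = rho * x1 + np.sqrt(1.0 - rho**2) * x2  # Corr(X, Y) = rho
    # Cross-classify the signs of X and Y into a 2x2 table
    # (dichotomizing at 0, the marginal medians, is an assumption).
    table = np.zeros((2, 2), dtype=int)
    for a, b in zip(x > 0, y > 0):
        table[int(a), int(b)] += 1
    return table

print(simulate_2x2_table(n=1000, rho=0.5, seed=1))
```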
Simulation Study. In order to study the performance of the cχ2 and normal distributions as approximations of the null distribution of the score statistic, we performed a simulation study. For the sake of simplicity we used the data structure of our example of 33 families (see below). We generated 100,000 data sets of independently binomially distributed outcomes and 100,000 data sets of independently normally distributed outcomes. The score statistics were calculated using correlation structure (2.1) based on the coefficients of relationship. We also studied the performance of the distributions in a very small set of nine families. In Table 2.1, the actual p-values corresponding to nominal p-values of 0.05, 0.01, 0.001 and 0.0001 are given. The results were in favour of the cχ2 distribution for both binomially and normally distributed outcomes. Even for the set of nine families, the cχ2 distribution performed very well.
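A minimal Python sketch of the calibration check itself: given draws of a test statistic simulated under the null (stood in here by a generic placeholder, not the family-based score statistic of the clause), compare the empirical rejection rate against critical values taken from a candidate c·χ2 approximation. The scale c and degrees of freedom are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
null_stats = rng.chisquare(df=1, size=100_000)  # placeholder null draws

c, df = 1.0, 1.0  # hypothetical scale and df of the c*chi^2 approximation
for nominal in (0.05, 0.01, 0.001, 0.0001):
    crit = c * stats.chi2.ppf(1 - nominal, df)  # critical value from c*chi^2
    actual = np.mean(null_stats > crit)         # actual type I error rate
    print(f"nominal {nominal:.4f} -> actual {actual:.4f}")
```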
Simulation Study. The aim of the simulation study is to evaluate empirically the power of the score test Sˆp in comparison with Pearson's χ2, the Zˆmax and the Zˆclump tests. We generated at least 1000 replicates from the multinomial distributions according to the models described previously. Without loss of generality we assumed that the first or first two haplotypes are associated with the disease. The remaining haplotypes were equally frequent. We varied the number of variants m from 3 to 20. The p-values of the test statistics were calculated empirically by means of 0000 Monte-Carlo permutations using a program based on the program Clump (Sham and Curtis, 1995). We used a nominal p-value of 0.05.

Type I error rate. To verify whether Monte-Carlo yields the right type I error rate of these test statistics, data sets were generated under the null model (λ = 0), each time for markers with 5, 7, 9, 11, 16 and 20 alleles. The frequency of the first allele was set to 0.5, whereas the remaining alleles were equally frequent. The results are shown in Table 4.1. The type I error rate is approximately equal to the nominal rate for the score Sˆp, Pearson's χ2 and Zˆclump tests, regardless of the number of alleles m at the marker locus, whereas Zˆmax becomes somewhat conservative as the number of marker alleles m increases (Sham and Curtis, 1995).
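As an illustration of the null-model check, a minimal Python sketch for Pearson's χ2 only (the Sˆp, Zˆmax and Zˆclump statistics are not reproduced here). The case/control sample sizes and replicate count are assumptions, and the asymptotic χ2 p-value stands in for the Monte-Carlo permutation p-value used in the clause.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

def type_I_error(m, n_cases=100, n_controls=100, reps=1000, alpha=0.05):
    # Null allele frequencies: first allele 0.5, the rest equally frequent.
    p = np.r_[0.5, np.full(m - 1, 0.5 / (m - 1))]
    rejections = 0
    for _ in range(reps):
        # Under lambda = 0 cases and controls share the same distribution.
        cases = rng.multinomial(n_cases, p)
        controls = rng.multinomial(n_controls, p)
        table = np.vstack([cases, controls])
        table = table[:, table.sum(axis=0) > 0]  # drop unobserved alleles
        _, pval, _, _ = stats.chi2_contingency(table)
        rejections += pval < alpha
    return rejections / reps

for m in (5, 7, 9, 11, 16, 20):
    print(m, type_I_error(m))
```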
Simulation Study. By means of a simulation study, we first evaluated the type I error rate of the score statistic Sˆ1, Pearson's chi square χ2, the likelihood ratio with equal weights LR, and Terwilliger's likelihood ratio with weights equal to the pj's, TLR. For the score statistic we used the chi square distribution with one degree of freedom to approximate the distribution under the null hypothesis. For the LR and TLR statistics we used the 50:50 mixture of two chi squares with zero and one degree of freedom. We generated 10,000 samples of 200 case chromosomes and 200 control chromosomes from the multinomial distributions with probabilities p1, . . . , pm for m equal to 4, 5, 8, 10, 15 and 20 haplotypes. Similar to the simulation described by Terwilliger, the frequency of the most common haplotype, p1, was set to 0.5, whereas the remaining haplotypes were equally frequent (0.5/(m − 1)). The results are shown in the left columns of Table 6.1. For all m, the type I error rates of the score statistic Sˆ1 were maintained at the nominal error rate. For m < 10, the type I error rates of Pearson's chi ...
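A minimal Python sketch of the two reference distributions used above and of the null data generation. Since a chi-square with zero degrees of freedom is a point mass at zero, the 50:50 mixture p-value for an observed value x > 0 is half the chi-square(1) tail; the sample sizes below follow the clause, while the example statistic value is illustrative.

```python
import numpy as np
from scipy import stats

def pvalue_chi2_df1(x):
    """Reference distribution for the score statistic: chi-square(1)."""
    return stats.chi2.sf(x, df=1)

def pvalue_mixture(x):
    """50:50 mixture of chi-square(0) and chi-square(1) for LR and TLR."""
    return 0.5 * stats.chi2.sf(x, df=1) if x > 0 else 1.0

print(pvalue_chi2_df1(3.84), pvalue_mixture(3.84))  # illustrative value

# Null haplotype frequencies of the clause: p1 = 0.5, the remaining
# m - 1 haplotypes equally frequent at 0.5/(m - 1).
rng = np.random.default_rng(7)
m = 10
p = np.r_[0.5, np.full(m - 1, 0.5 / (m - 1))]
cases = rng.multinomial(200, p)     # 200 case chromosomes
controls = rng.multinomial(200, p)  # 200 control chromosomes
print(cases, controls, sep="\n")
```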
Simulation Study. The BioCycle Study, conducted from 2005 to 2007, followed premenopausal women from Western New York State for one or two complete menstrual cycles. Regularly menstruating women not currently taking oral contraceptives were eligible for participation. In total, 259 women between the ages of 18 and 44 completed the study. Data collected during the study included age (years) and BMI (kg/m2), as well as serum estradiol, vitamin E, and HDL levels, which were measured on the 22nd day of a participant's menstrual cycle. Participant BMI was right-skewed, with values ranging from 16.1 to 35.0, with an average body mass index of ...
Simulation Study. For each of the simulation scenarios, 5000 simulations were performed in R. Datasets from the first two simulation studies were simulated to resemble actual motivating data described in Section 3.2, with sample size N = 672. Independent predictor variables were generated to mimic age (years), smoking status (yes/no), race (1 = white / 2 = black), and SA status (yes/no), and the outcome variable was generated to resemble the cytokine MCP1 (µg/mL) based on a lognormal regression against those predictors. Age was simulated as a normal random variable with mean 26.6 and standard deviation 6.4, then rounded to the nearest whole number (this permits the formation of x-homogeneous pools when average pool size is small). Smoking status, race, and SA status were simulated as Bernoulli random variables with probabilities 0.47, 0.28, and 0.46, respectively. The outcome, MCP1, was generated under a lognormal distribution such that E[log(MCP1)|X] = −2.48 + 0.017(Age) + 0.007(Smoking Status) − 0.388(Race) + 0.132(SA) and Var[log(MCP1)|X] = 1.19. In the first study, we assess each of the proposed analytical strategies when applied to x-homogeneous pools (n = 336) mimicking data from the CPP substudy, and in the next study we compare estimate precision from the various pooling strategies applied to the same generated datasets, comparing k-means clustering to random pooling and selection when x-homogeneous pools cannot be formed (n = 112). The last two simulation studies were developed to assess performance of the analytical methods in additional scenarios. First, we generate a dataset such that application of all proposed methods (excluding the Naive Model) is feasible and theoretically justified. Specifically, pools were formed x-homogeneously on the covariates (to justify analysis under the Approximate Model) with a maximum pool size of 2 (to enable application of the Convolution Method). In the first two simulation studies, the nature of the simulated data precluded formation of pools with both of these characteristics. The final simulation demonstrates a scenario in which the Approximate Model fails and the Convolution Method falters, to caution against analysis via the former when pools are not x-homogeneous, and via the latter (even for pools of maximum size 2) when the convolution integral may be poorly behaved.
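A minimal Python sketch of the individual-level data generation described above (the clause uses R; Python is used here for consistency with the other sketches). Race is coded as a 0/1 indicator drawn with probability 0.28, an assumed recoding of the clause's 1 = white / 2 = black labels that only shifts the intercept.

```python
import numpy as np

rng = np.random.default_rng(2023)
N = 672

age = np.round(rng.normal(26.6, 6.4, N))  # rounded to whole years
smoke = rng.binomial(1, 0.47, N)          # smoking status (yes/no)
race = rng.binomial(1, 0.28, N)           # race indicator (assumed 0/1 coding)
sa = rng.binomial(1, 0.46, N)             # SA status (yes/no)

# log(MCP1) | X is normal with the stated mean and variance 1.19.
mean_log = -2.48 + 0.017 * age + 0.007 * smoke - 0.388 * race + 0.132 * sa
mcp1 = np.exp(rng.normal(mean_log, np.sqrt(1.19)))  # lognormal outcome
print(mcp1[:5])
```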
Simulation Study. To examine the validity of our hybrid algorithm, we create a dataset according to the generative process of LDA-MLC with the number of documents per time period, Dt = 300 for t = 1, 2, …, 12, the size of the vocabulary, V = 100, the number of tokens in each document d, N(d) ~ Unif(10, 90), the number of topics, K = 5, the concentration parameter for topic distributions of documents, α0 = 5, and the symmetric Dirichlet parameter for word distributions of topics, β = 0.5. Two changepoints are assumed to occur at the beginning of t = 5 and t = 9. The true values and the parameter estimates under each model specification are shown in Table A1. The hybrid algorithm is run for 5,000 iterations after a burn-in of 5,000 iterations. Since one of the parameters is estimated by ML, we report its sample standard errors to show its stability after the burn-in period (Ko et al. 2015). [Insert Table A1 Here] LDA-MLC correctly identifies the two changepoints that we preset. As shown in Table A1, nearly all the true values are contained within the 95% confidence intervals for LDA-MLC, demonstrating our ability to recover the true parameter values.
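A minimal Python sketch of the LDA generative step used to build such a synthetic corpus, with the settings reconstructed above (V = 100, K = 5, document lengths Unif(10, 90), α0 = 5, β = 0.5). The changepoint mechanism of LDA-MLC (topic distributions shifting at t = 5 and t = 9) is not reproduced, and the split of the concentration parameter α0 across topics is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
V, K = 100, 5
alpha0, beta = 5.0, 0.5

# One word distribution per topic from a symmetric Dirichlet(beta).
phi = rng.dirichlet(np.full(V, beta), size=K)

def generate_document():
    n_d = rng.integers(10, 91)                     # N(d) ~ Unif(10, 90)
    theta = rng.dirichlet(np.full(K, alpha0 / K))  # topic mix (assumed alpha0/K split)
    z = rng.choice(K, size=n_d, p=theta)           # topic of each token
    return np.array([rng.choice(V, p=phi[k]) for k in z])

doc = generate_document()
print(len(doc), doc[:10])
```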
Simulation Study. In the simulation, we set the numbers of subjects, raters and time points to I = 100, J = 30 and Ti = 5 for all i = 1, . . . , I. We will first demonstrate our approach in Section 3.1 based on one simulated dataset for each setup. We will also present the averages of parameter estimates based on 1000 Monte Carlo replicates. In Section 3.2, we will compare our approach with approaches that do not account for the rater effect. The GLMM fitting is implemented by PROC GLIMMIX in SAS 9.4 [35].
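A minimal Python sketch of the simulation layout only: I = 100 subjects, J = 30 raters, Ti = 5 time points per subject. The clause does not spell out the GLMM itself (the fitting is done in PROC GLIMMIX), so a logistic model with additive subject and rater random effects, and all variance and intercept values, are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
I, J, T = 100, 30, 5

subj_re = rng.normal(0, 1.0, I)   # subject random effects (assumed SD)
rater_re = rng.normal(0, 0.5, J)  # rater random effects (assumed SD)

y = np.empty((I, J, T), dtype=int)
for i in range(I):
    for j in range(J):
        eta = -0.5 + subj_re[i] + rater_re[j]  # assumed fixed intercept
        prob = 1.0 / (1.0 + np.exp(-eta))      # logistic link
        y[i, j, :] = rng.binomial(1, prob, T)  # binary ratings over time

print(y.mean())  # overall positive-rating rate
```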
Simulation Study. In this section we describe some simulation experiments carried out with the following purposes: (a) to check whether taking into account the spatial correlation between small areas in the model improves the precision of small area estimators; (b) to study the small-sample behavior of the different MSE estimators introduced in this chapter, for different values of the spatial correlation parameter ρ and for different patterns of sampling variances ψd; (c) to analyze the robustness of the proposed bootstrap procedures to non-normality of the random effects and errors. The experiments are based on a real population, the map of the D = 287 municipalities (small areas) of Tuscany. We considered a model with p = 2, that is, one explanatory variable and a constant, with a D × 2 design matrix X = [1D x], where 1D is a column vector of ones of size D and x = (x1, . . . , xD)′ contains the values of the explanatory variable. These values xd were generated from a uniform distribution in the interval (0, 1). The true model coefficients were β = (1, 2)′, the random effects variance σu2 = 1 and the spatial correlation parameter ρ ∈ {0.25, 0.5, 0.75}. The matrix of sampling variances ψ = diag(ψ1, . . . , ψD) was taken as ψd = 0.7 for 1 ≤ d ≤ 60; ψd = 0.6 for 61 ≤ d ≤ 120; ψd = 0.5 for 121 ≤ d ≤ 180; ψd = 0.4 for 181 ≤ d ≤ 240; and finally ψd = 0.3 for 241 ≤ d ≤ 287 (see Xxxxx et al. ...
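A minimal Python sketch of one Monte Carlo draw from an area-level model with SAR(1) random effects, u ~ N(0, σu2 [(I − ρW′)(I − ρW)]−1), using the β, σu2, ρ and ψd values above. The Tuscany proximity matrix is not available here, so a row-standardized "chain" neighbourhood (areas d − 1 and d + 1) is assumed purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(10)
D = 287
beta = np.array([1.0, 2.0])
sigma2_u, rho = 1.0, 0.5

# Assumed neighbourhood structure: chain neighbours, row-standardized.
W = np.zeros((D, D))
for d in range(D):
    for nb in (d - 1, d + 1):
        if 0 <= nb < D:
            W[d, nb] = 1.0
W /= W.sum(axis=1, keepdims=True)

X = np.column_stack([np.ones(D), rng.uniform(0, 1, D)])  # [1_D  x]
A = np.eye(D) - rho * W
cov_u = sigma2_u * np.linalg.inv(A.T @ A)       # SAR(1) covariance
u = rng.multivariate_normal(np.zeros(D), cov_u)  # spatial random effects

# Sampling variances in five blocks, as in the clause.
psi = np.repeat([0.7, 0.6, 0.5, 0.4, 0.3], [60, 60, 60, 60, 47])
e = rng.normal(0, np.sqrt(psi))                 # sampling errors
y = X @ beta + u + e                            # direct estimators
print(y[:5])
```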
Simulation Study. In this Section we study the gain obtained in the prediction of direct estimators of the FGT poverty measures (for α = 0), in the sense of mean squared error of small area predictors, when considering three different approaches to the computation of the matrix W in (4.5). The first one is the classical approach, where W is a typicality matrix, whereas the second and third approaches consist in implementing the techniques described in Sections 5.2 and 5.3, respectively. We start by describing the data to be used in the model described in Section 4.2. They consist of official data from the Spanish Survey of Income and Living Conditions corresponding to year 2006 for D = 51 Spanish provinces (the small areas). The response variable is the direct estimator of the FGT poverty measure (for α = 0), that is, the proportion of poor in the area. The auxiliary covariates are the intercept and the following proportions (in the area): Spanish people; people of ages from 16 to 24, from 25 to 49, from 50 to 64, and equal to or greater than 65; people with no studies up to primary studies; Graduate people; employees; unemployed people; inactive people. We have selected from the Instituto Nacional de Estadística website (xxxx://xxx.xxx.xx) the most relevant socioeconomic variables related to poverty, namely the unemployment rate and the share of the illiterate population over 16 years old. These variables have been measured in the D = 51 provinces from 1991 to 2005 (J = 15 years). Therefore, in practice we have two matrices, X1 and X2, of size 51 × 15. In order to compute the matrix W with the multivariate approach of Section 5.2, we only consider the information contained in the J-th columns of X1 and X2, which leads to a matrix of size 51 × 2. We call WM the proximity matrix computed with the methodology described in Section 5.2. To compute the matrix W using the functional approach of Section 5.3, we have obtained two semi-metrics D(2)1,q and D(2)2,q, one for each data set (see Ferraty and Vieu (2006)), for q = 4 functional principal components, since q = 4 is enough to capture most of the observed variability. Finally, in order to obtain a square matrix of joint distances from the previous two, we have used the related metric scaling technique, introduced by Cuadras and Fortiana (1998), which provides a joint metric from different metrics on the same individuals, taking into account the possible redundant information that can be added simply by adding distance matrices. We call WF the...
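A minimal Python sketch of the multivariate construction of WM: take the latest-year values of the two covariates for the D = 51 provinces, compute pairwise Euclidean distances, and turn them into a row-standardized proximity matrix. The conversion from distance to proximity (inverse distance with a zero diagonal) is an assumption; the clause does not spell it out, and random numbers stand in for the real covariate columns.

```python
import numpy as np

rng = np.random.default_rng(3)
D = 51
Z = rng.uniform(size=(D, 2))  # stand-in for the J-th columns of X1 and X2

# Pairwise Euclidean distances between provinces in covariate space.
diff = Z[:, None, :] - Z[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=2))

# Assumed proximity: inverse distance, zero diagonal, rows summing to 1.
W = np.where(dist > 0, 1.0 / (dist + 1e-12), 0.0)
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)
print(W.shape, W[0, :5])
```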