Simulation Study Sample Clauses

Simulation Study. A simulation study is performed to compare the proposed adjusted degree of distinguishability with the classical one. It also aims to develop a table for interpreting the adjusted degree of distinguishability. To generate 2 × 2 contingency tables, we used the method presented by Xxxxxx and Xxxx [8]. The bivariate standard normal distribution is used. In the first step, two independent and identically distributed random variables (X1 and X2) are generated. Equations (4.1) and (4.2) are then used to generate two random variables (X and Y) from a bivariate normal distribution with a given correlation (ρ).
Simulation Study. I conducted four sets of real-data-based simulation studies to demonstrate the advantages of IPBT over existing methods. 1) In the first simulation study, I used 566 normal solid-tissue microarray datasets, obtained with the Affymetrix GeneChip U133A from the global gene expression map, to show a general trend between mean value and SD for genes in microarray data. All the following simulations are generated with the parameters obtained from these 566 normal samples. We also show the SD estimates from different methods against the truth to illustrate the over-shrinkage phenomenon and how IPBT avoids it. 2) In the second set of simulations, I show the false discovery rates (FDR) and receiver operating characteristic (ROC) curves for IPBT and competing methods. I also show the consistency of IPBT and other existing methods on independent datasets. 3) In the last simulation, I show that IPBT remains robust even when the historical data contain some noise.
Simulation Study. I conducted two sets of real-data-based simulation studies to demonstrate the advantages of our new approaches. 1) In the first set of simulations, I use 566 normal solid-tissue microarray datasets obtained with the Affymetrix GeneChip U133A from the global gene expression map to show the correlation between the SD estimates and the true SDs. All the following simulations are generated with the parameters obtained from these 566 normal samples. We also show by simulation that GDM is a good indicator for group dividing. 2) In the second set of simulations, I show that the false discovery rates (FDR) and receiver operating characteristic (ROC) curves for our new approaches are almost as good as those of IPBT and outperform all other competing methods. I also show that our new approaches can be more robust than IPBT when the historical data are not of high quality.
Simulation Study. I conducted two sets of data-based simulation studies to illustrate the consistency of gene panels and their applications to DE gene detection.
Simulation Study. In this section we describe some simulation experiments carried out with the following purposes: (a) to check whether taking into account the spatial correlation between small areas in the model improves the precision of small area estimators; (b) to study the small-sample behavior of the different MSE estimators introduced in this chapter, for different values of the spatial correlation parameter ρ and for different patterns of sampling variances ψ_d; (c) to analyze the robustness of the proposed bootstrap procedures to non-normality of the random effects and errors. The experiments are based on a real population, the map of the D = 287 municipalities (small areas) of Tuscany. We considered a model with p = 2, that is, one explanatory variable and a constant, with a D × 2 design matrix X = [1_D x], where 1_D is a column vector of ones of size D and x = (x_1, . . . , x_D)′ contains the values of the explanatory variable. These values x_d were generated from a uniform distribution on the interval (0, 1). The true model coefficients were β = (1, 2)′, the random effects variance σ²_u = 1 and the spatial correlation parameter ρ ∈ {0.25, 0.5, 0.75}. The matrix of sampling variances ψ = diag(ψ_1, . . . , ψ_D) was taken as ψ_d = 0.7 for 1 ≤ d ≤ 60; ψ_d = 0.6 for 61 ≤ d ≤ 120; ψ_d = 0.5 for 121 ≤ d ≤ 180; ψ_d = 0.4 for 181 ≤ d ≤ 240; and finally ψ_d = 0.3 for 241 ≤ d ≤ 287 (see Xxxxx et al.).
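One replicate of the population described above can be sketched in Python. The real study uses the Tuscany municipality map to build its proximity matrix; here a simple chain-neighbour matrix and a SAR specification for the area effects are stand-in assumptions, and variable names are ours:

```python
import numpy as np

rng = np.random.default_rng(1)
D = 287

# Design matrix: intercept plus one Uniform(0, 1) covariate.
x = rng.uniform(0, 1, D)
X = np.column_stack([np.ones(D), x])
beta = np.array([1.0, 2.0])          # true coefficients (1, 2)'

# Sampling-variance pattern psi_d from the clause.
psi = np.concatenate([np.full(60, 0.7), np.full(60, 0.6),
                      np.full(60, 0.5), np.full(60, 0.4),
                      np.full(47, 0.3)])

# Row-standardized chain-neighbour proximity matrix W (a stand-in for
# the actual Tuscany contiguity matrix, which is not reproduced here).
W = np.zeros((D, D))
for d in range(D):
    for nb in (d - 1, d + 1):
        if 0 <= nb < D:
            W[d, nb] = 1.0
W /= W.sum(axis=1, keepdims=True)

def simulate_y(rho, sigma2_u=1.0):
    """Draw one population: SAR(rho) area effects plus sampling error."""
    eps = rng.standard_normal(D) * np.sqrt(sigma2_u)
    u = np.linalg.solve(np.eye(D) - rho * W, eps)   # u = (I - rho*W)^{-1} eps
    e = rng.standard_normal(D) * np.sqrt(psi)       # heteroscedastic error
    return X @ beta + u + e

y = simulate_y(rho=0.5)
```

Repeating `simulate_y` over ρ ∈ {0.25, 0.5, 0.75} reproduces the experimental grid of the clause.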
Simulation Study. In this section we study the gain obtained in the prediction of direct estimators of the FGT poverty measures (for α = 0), in the sense of mean squared error of small area predictors, when considering three different approaches to the computation of the matrix W in (4.5). The first one is the classical approach, where W is a typicality matrix, whereas the second and third approaches consist in implementing the techniques described in Sections 5.2 and 5.3, respectively. We start by describing the data to be used in the model described in Section 4.2. They consist of official data from the Spanish Survey of Income and Living Conditions corresponding to the year 2006 for D = 51 Spanish provinces (the small areas). The response variable is the direct estimator of the FGT poverty measure (for α = 0), that is, the proportion of poor in the area. The auxiliary covariates are the intercept and the following proportions (in the area): Spanish people; people of ages 16 to 24, 25 to 49, 50 to 64, and 65 or over; people with no studies up to primary studies; graduates; employees; unemployed people; and inactive people. We have selected from the Instituto Nacional xx Xxxxx´ıstica website (xxxx://xxx.xxx.xx) the most relevant socioeconomic variables related to poverty, namely the unemployment rate and the share of the illiterate population over 16 years old. These variables have been measured in the D = 51 provinces from 1991 to 2005 (J = 15 years). Therefore, in practice we have two matrices, X1 and X2, of size 51 × 15. In order to compute the matrix W with the multivariate approach of Section 5.2, we only consider the information contained in the J-th columns of X1 and X2, which leads to a matrix of size 51 × 2. We call WM the proximity matrix computed with the methodology described in Section 5.2.
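The 51 × 2 reduction and one plausible distance-based proximity matrix can be sketched in Python. Section 5.2's exact recipe for turning distances into W is not reproduced in the clause, so the inverse-distance, row-standardized construction below is an assumption, and the random matrices stand in for the real province-by-year data:

```python
import numpy as np

rng = np.random.default_rng(0)
D, J = 51, 15

# Stand-ins for the two province-by-year covariate matrices
# (unemployment rate and illiteracy share); real data not reproduced.
X1 = rng.uniform(0, 1, (D, J))
X2 = rng.uniform(0, 1, (D, J))

# Keep only the final year (J-th column) of each series -> 51 x 2.
Z = np.column_stack([X1[:, -1], X2[:, -1]])
Z = (Z - Z.mean(axis=0)) / Z.std(axis=0)      # standardize columns

# Pairwise Euclidean distances between provinces.
dist = np.sqrt(((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2))

# One common proximity construction (an assumption): inverse distance,
# zero diagonal, rows standardized to sum to one.
with np.errstate(divide="ignore"):
    WM = np.where(dist > 0, 1.0 / dist, 0.0)
WM /= WM.sum(axis=1, keepdims=True)
```

Any monotone decreasing transform of the distances would serve the same purpose; row standardization keeps W comparable with the typicality-matrix baseline.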
To compute the matrix W using the functional approach of Section 5.3, we have obtained two semi-metrics, D^(2)_{1,q} and D^(2)_{2,q}, one for each data set (see Ferraty and Xxxx (2006)), for q = 4 functional principal components, since q = 4 is enough to capture most of the observed variability. Finally, in order to obtain a square matrix of joint distances from the previous two, we have used the related metric scaling technique, introduced by Xxxxxxx and Fortiana (1998), which provides a joint metric from different metrics on the same individuals, taking into account the possible redundant information that can be added simply by summing distance matrices. We call WF the...
Simulation Study. To examine the validity of our hybrid algorithm, we create a dataset according to the generative process of LDA-MLC with the number of documents per period N_t = 300, t = 1, 2, …, 12, the size of the vocabulary V = 100, the number of tokens in each document d, N(d) ~ Unif(10, 90), the number of topics K = 5, the concentration parameter for the topic distributions of documents α_0 = 5, and the symmetric Dirichlet parameter for the word distributions of topics β = 0.5. Two changepoints are assumed to occur at the beginning of t = 5 and t = 9. The true values and the parameter estimates under each model specification are shown in Table A1. The hybrid algorithm is run for 5,000 iterations after a 5,000-iteration burn-in period. Since we estimate α_0 by ML, we report the sample standard errors to show its stability after the burn-in period (Ko et al. 2015). [Insert Table A1 Here] LDA-MLC correctly identifies the two changepoints that we preset. As shown in Table A1, nearly all the true values are contained within the 95% confidence intervals for LDA-MLC, demonstrating our ability to recover the true parameter values.
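The underlying generative step can be sketched in Python for plain LDA, without the multiple-changepoint structure specific to LDA-MLC. The symmetric α_0/K parameterization of the document-topic Dirichlet and all variable names are assumptions on our part:

```python
import numpy as np

rng = np.random.default_rng(7)
V, K = 100, 5          # vocabulary size, number of topics
alpha0, beta = 5.0, 0.5  # Dirichlet concentration parameters
n_docs = 300           # documents per period in the clause

# Topic-word distributions: symmetric Dirichlet(beta) over the vocabulary.
phi = rng.dirichlet(np.full(V, beta), size=K)   # shape (K, V)

docs = []
for d in range(n_docs):
    theta = rng.dirichlet(np.full(K, alpha0 / K))  # document-topic mixture
    n_d = rng.integers(10, 91)                     # tokens per doc ~ Unif(10, 90)
    z = rng.choice(K, size=n_d, p=theta)           # topic assignment per token
    w = np.array([rng.choice(V, p=phi[k]) for k in z])  # word per token
    docs.append(w)
```

LDA-MLC would additionally switch the topic-level parameters at the preset changepoints (t = 5 and t = 9); this sketch covers only the within-period draw.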
Simulation Study. The BioCycle Study, conducted from 2005 to 2007, followed premenopausal women from Western New York State for one or two complete menstrual cycles. Regularly menstruating women not currently taking oral contraceptives were eligible for participation. A total of 259 women between the ages of 18 and 44 completed the study. Data collected during the study included age (years) and BMI (kg/m²), as well as serum estradiol, vitamin E, and HDL levels, which were measured on the 22nd day of a participant's menstrual cycle. Participant BMI was right-skewed, with values ranging from 16.1 to 35.0, with an average body mass index of
Simulation Study. For each of the simulation scenarios, 5000 simulations were performed in R. Datasets from the first two simulation studies were simulated to resemble actual motivating data described in Section 3.2, with sample size N = 672. Independent predictor variables were generated to mimic age (years), smoking status (yes/no), race (1 = white / 2 = black), and SA status (yes/no), and the outcome variable was generated to resemble the cytokine MCP1 (µg/mL) based on a lognormal regression against those predictors. Age was simulated as a normal random variable with mean 26.6 and standard deviation 6.4, then rounded to the nearest whole number (this permits the formation of x-homogeneous pools when average pool size is small). Smoking status, race, and SA status were simulated as Bernoulli random variables with probabilities 0.47, 0.28, and 0.46, respectively. The outcome, MCP1, was generated under a lognormal distribution such that E[log(MCP1)|X] = −2.48 + 0.017(Age) + 0.007(Smoking Status) − 0.388(Race) + 0.132(SA) and Var[log(MCP1)|X] = 1.19. In the first study, we assess each of the proposed analytical strategies when applied to x-homogeneous pools (n = 336) mimicking data from the CPP substudy, and in the next study we compare estimate precision from the various pooling strategies applied to the same generated datasets, comparing k-means clustering to random pooling and selection when x-homogeneous pools cannot be formed (n = 112). The last two simulation studies were developed to assess performance of the analytical methods in additional scenarios. First, we generate a dataset such that application of all proposed methods (excluding the Naive Model) is feasible and theoretically justified. Specifically, pools were formed x-homogeneously on the covariates (to justify analysis under the Approximate Model) with a maximum pool size of 2 (to enable application of the Convolution Method).
In the first two simulation studies, the nature of the simulated data precluded formation of pools with both of these characteristics. The final simulation demonstrates a scenario in which the Approximate Model fails and the Convolution Method falters, to caution against analysis via the former when pools are not x-homogeneous, and via the latter (even for pools of maximum size 2) when the convolution integral may be poorly behaved.
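The data-generating scheme described above can be sketched in Python (the study itself used R; coding race as a 0/1 indicator with success probability 0.28 is our reading of the 1 = white / 2 = black coding, and variable names are ours):

```python
import numpy as np

rng = np.random.default_rng(3)
N = 672

# Covariates mimicking the motivating data from Section 3.2.
age = np.round(rng.normal(26.6, 6.4, N))   # rounded to whole years
smoke = rng.binomial(1, 0.47, N)           # smoking status (yes/no)
race = rng.binomial(1, 0.28, N)            # indicator read from 1=white/2=black
sa = rng.binomial(1, 0.46, N)              # SA status (yes/no)

# Lognormal outcome with the stated conditional mean and variance
# of log(MCP1).
mu = -2.48 + 0.017 * age + 0.007 * smoke - 0.388 * race + 0.132 * sa
mcp1 = np.exp(rng.normal(mu, np.sqrt(1.19)))
```

Rounding age to whole years is what makes exact x-homogeneous pools (identical covariate vectors within a pool) attainable at small average pool sizes, as the clause notes.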
Simulation Study. In the simulation, we set the numbers of subjects, raters and time points as I = 100, J = 30 and Ti = 5 for every i = 1, . . . , I. We first demonstrate our approach in Section 3.1 based on one simulated dataset for each setup. We also present the averages of the parameter estimates based on 1000 Monte Carlo replicates. In Section 3.2, we compare our approach with approaches that do not account for the rater effect. The GLMM fitting is implemented with PROC GLIMMIX in SAS 9.4 [35].