Abstract Background: We applied machine learning to find a novel breast cancer predictor based on information in a mammogram. Methods: Using image-processing techniques, we automatically processed 46 158 analog mammograms for 1345 cases and 4235 controls from a cohort and case–control study of Australian women, and a cohort study of Japanese American women, extracting 20 textural features not based on pixel brightness threshold. We used Bayesian lasso regression to create individual- and mammogram-specific measures of breast cancer risk, Cirrus. We trained and tested measures across studies. We fitted Cirrus with conventional mammographic density measures using logistic regression, and computed odds ratios (OR) per standard deviation adjusted for age and body mass index. Results: Combining studies, almost all textural features were associated with case–control status. The ORs for Cirrus measures trained on one study and tested on another study ranged from 1.56 to 1.78 (all P < 10^-6). For the Cirrus measure derived from combining studies, the OR was 1.90 (95% confidence interval [CI] = 1.73 to 2.09), equivalent to a fourfold interquartile risk ratio, and was little attenuated after adjusting for conventional measures. In contrast, the OR for the conventional measure was 1.34 (95% CI = 1.25 to 1.43), and after adjusting for Cirrus it became 1.16 (95% CI = 1.08 to 1.24; P = 4 × 10^-5). Conclusions: A fully automated personal risk measure created from combining textural image features performs better at predicting breast cancer risk than conventional mammographic density risk measures, capturing half the risk-predicting ability of the latter measures. In terms of differentiating affected and unaffected women on a population basis, Cirrus could be one of the strongest known risk factors for breast cancer.
Abstract In this note, we develop a novel algorithm for generating random numbers from a distribution with a probability density function proportional to and Our algorithm is highly efficient and is based on rejection sampling where the envelope distribution is an appropriately chosen beta distribution. An example application illustrating how the new algorithm can be used to generate random correlation matrices is discussed.
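As a rough illustration of rejection sampling with a beta envelope, the sketch below may help. The target density is not recoverable from the text above, so the code assumes, purely for illustration, a target proportional to sin(x)^k on (0, π) with k ≥ 1; the function name `sample_sin_power` is hypothetical, and this is not presented as the note's actual algorithm. It relies on the inequality sin(πu) ≤ 4u(1 − u) for u in (0, 1), so a Beta(k + 1, k + 1) proposal rescaled to (0, π) dominates the assumed target.

```python
import math
import random


def sample_sin_power(k, rng=random):
    """Draw one x in (0, pi) with density proportional to sin(x)**k.

    Assumption: the target sin(x)**k is chosen for illustration only.
    Proposal: u ~ Beta(k + 1, k + 1), x = pi * u.
    """
    while True:
        u = rng.betavariate(k + 1, k + 1)
        if u <= 0.0 or u >= 1.0:
            continue  # guard against degenerate draws at the boundary
        x = math.pi * u
        # sin(pi * u) <= 4 u (1 - u), so this ratio lies in (0, 1].
        ratio = math.sin(x) / (4.0 * u * (1.0 - u))
        # Accept with probability ratio**k; compare on the log scale.
        # 1 - random() lies in (0, 1], which keeps log() defined.
        if math.log(1.0 - rng.random()) <= k * math.log(ratio):
            return x


if __name__ == "__main__":
    random.seed(1)
    draws = [sample_sin_power(5) for _ in range(5)]
    print(draws)
```

Working on the log scale avoids underflow when k is large, and the envelope touches the assumed target at x = π/2, which is where most of the mass sits.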
Background: Folate and other one-carbon metabolism nutrients are essential to enable DNA methylation to occur, but the extent to which their dietary intake influences methylation in adulthood is unclear. Objective: We assessed associations between dietary intake of these nutrients and DNA methylation in peripheral blood, overall and at specific genomic locations. Design: We conducted a cross-sectional study using baseline data and samples from 5186 adult participants in the Melbourne Collaborative Cohort Study (MCCS). Nutrient intake was estimated from a food-frequency questionnaire. DNA methylation was measured by using the Illumina Infinium HumanMethylation450 BeadChip array (HM450K). We assessed associations of intakes of folate, riboflavin, vitamins B-6 and B-12, methionine, choline, and betaine with methylation at individual cytosine-guanine dinucleotides (CpGs), and with median (genome-wide) methylation across all CpGs, CpGs in gene bodies, and CpGs in gene promoters. We also assessed associations with methylation at long interspersed nuclear element 1 (LINE-1), satellite 2 (Sat2), and Arthrobacter luteus restriction endonuclease (Alu) repetitive elements for a subset of participants. We used linear mixed regression, adjusting for age, sex, country of birth, smoking, energy intake from food, alcohol intake, Mediterranean diet score, and batch effects to assess log-linear associations with dietary intake of each nutrient. In secondary analyses, we assessed associations with low or high intakes defined by extreme quintiles. Results: No evidence of log-linear association was observed at P < 10^-7 between the intake of one-carbon metabolism nutrients and methylation at individual CpGs. Low intake of riboflavin was associated with higher methylation at CpG cg21230392 in the first exon of PROM1 (P = 5.0 × 10^-8). No consistent evidence of association was observed with genome-wide or repetitive element measures of methylation.
Conclusion: Our findings suggest that dietary intake of one-carbon metabolism nutrients in adulthood, as measured by a food-frequency questionnaire, has little association with blood DNA methylation. An association with low intake of riboflavin requires replication in independent cohorts. This study was registered at http://www.clinicaltrials.gov as NCT03227003.
DNA methylation can mimic the effects of germline mutations in cancer predisposition genes. Recently, we identified twenty‐four heritable methylation marks associated with breast cancer risk. As breast and prostate cancer share genetic risk factors, including rare, high‐risk mutations (eg, in BRCA2), we hypothesized that some of these heritable methylation marks might also be associated with the risk of prostate cancer.
Background: Clustering of breast and colorectal cancer has been observed within some families and cannot be explained by chance or known high-risk mutations in major susceptibility genes. Potential shared genetic susceptibility between breast and colorectal cancer, not explained by high-penetrance genes, has been postulated. We hypothesized that yet undiscovered genetic variants predispose to a breast-colorectal cancer phenotype. Methods: To identify variants associated with a breast-colorectal cancer phenotype, we analyzed genome-wide association study (GWAS) data from cases and controls that met the following criteria: cases (n = 985) were women with breast cancer who had one or more first- or second-degree relatives with colorectal cancer, men/women with colorectal cancer who had one or more first- or second-degree relatives with breast cancer, and women diagnosed with both breast and colorectal cancer. Controls (n = 1769) were unrelated, breast and colorectal cancer-free, and age- and sex-frequency-matched to cases. After imputation, 6,220,060 variants were analyzed using the discovery set, and variants associated with the breast-colorectal cancer phenotype at P < 5.0E-04 (n = 549, at 60 loci) were analyzed for replication (n = 293 cases and 2,103 controls). Results: Multiple correlated SNPs in intron 1 of the ROBO1 gene were suggestively associated with the breast-colorectal cancer phenotype in the discovery and replication data (most significant: rs7430339, Pdiscovery = 1.2E-04; rs7429100, Preplication = 2.8E-03). In meta-analysis of the discovery and replication data, the most significant association remained at rs7429100 (P = 1.84E-06). Conclusion: The results of this exploratory analysis did not find clear evidence for a susceptibility locus with a pleiotropic effect on hereditary breast and colorectal cancer risk, although the suggestive association of genetic variation in the region of ROBO1, a potential tumor suppressor gene, merits further investigation.
This paper applies the minimum message length principle to inference of linear regression models with Student-t errors. A new criterion for variable selection and parameter estimation in Student-t regression is proposed. By exploiting properties of the regression model, we derive a suitable non-informative proper uniform prior distribution for the regression coefficients that leads to a simple and easy-to-apply criterion. Our proposed criterion does not require specification of hyperparameters and is invariant under both full rank transformations of the design matrix and linear transformations of the outcomes. We compare the proposed criterion with several standard model selection criteria, such as the Akaike information criterion and the Bayesian information criterion, on simulations and real data with promising results.
Global-local shrinkage hierarchies are an important innovation in Bayesian estimation. We propose the use of log-scale distributions as a novel basis for generating families of prior distributions for local shrinkage hyperparameters. By varying the scale parameter one may vary the degree to which the prior distribution promotes sparsity in the coefficient estimates. By examining the class of distributions over the logarithm of the local shrinkage parameter that have log-linear, or sub-log-linear tails, we show that many standard prior distributions for local shrinkage parameters can be unified in terms of the tail behaviour and concentration properties of their corresponding marginal distributions over the coefficients $\beta_j$. We derive upper bounds on the rate of concentration around $|\beta_j|=0$, and the tail decay as $|\beta_j| \to \infty$, achievable by this wide class of prior distributions. We then propose a new type of ultra-heavy tailed prior, called the log-$t$ prior, with the property that, irrespective of the choice of associated scale parameter, the marginal distribution always diverges at $\beta_j = 0$, and always possesses super-Cauchy tails. We develop results demonstrating when prior distributions with (sub)-log-linear tails attain Kullback--Leibler super-efficiency and prove that the log-$t$ prior distribution is always super-efficient. We show that the log-$t$ prior is less sensitive to misspecification of the global shrinkage parameter than the horseshoe or lasso priors. By incorporating the scale parameter of the log-scale prior distributions into the Bayesian hierarchy, we derive novel adaptive shrinkage procedures. Simulations show that the adaptive log-$t$ procedure appears to always perform well, irrespective of the level of sparsity or signal-to-noise ratio of the underlying model.