Background: Mammographic (or breast) density is an established risk factor for breast cancer. There are a variety of approaches to measurement, including quantitative, semi-automated and automated approaches. We present a new automated measure, AutoCumulus, learnt from applying deep learning to semi-automated measures. Methods: We used mammograms of 9,057 population-screened women in the BRAIx study for which semi-automated measurements of mammographic density had been made by experienced readers using the CUMULUS software. The dataset was split into training, testing, and validation sets (80%, 10%, 10%, respectively). We applied a deep learning regression model (fine-tuned ConvNeXtSmall) to estimate percentage density and assessed performance by the correlation between estimated and measured percent density and a Bland-Altman plot. The automated measure was tested on an independent CSAW-CC dataset in which density had been measured using the LIBRA software, comparing measures for left and right breasts, specificity for high sensitivity, and areas under the receiver operating characteristic curve (AUCs). Results: Based on the testing dataset, the correlation in percent density between the automated and human measures was 0.95, and the differences were only slightly larger for women with higher density. Based on the CSAW-CC dataset, AutoCumulus outperformed LIBRA in correlation between left and right breast (0.95 versus 0.79; P<0.001), specificity for 95% sensitivity (13% versus 10%; P<0.001), and AUC (0.638 versus 0.597; P<0.001). Conclusion: We have created an automated measure of mammographic density that is accurate and that performs better than another well-established automated measure on repeatability within a woman and on prediction of interval cancers.
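As an illustration of the "specificity for 95% sensitivity" metric reported above, the following minimal Python sketch picks the highest score threshold that still classifies the target fraction of cases as positive, then reports the fraction of controls falling below it. The function name, the "score at or above threshold is positive" convention and the toy scores are assumptions made for illustration; they are not taken from the study.

```python
import math


def specificity_at_sensitivity(case_scores, control_scores, target_sensitivity=0.95):
    """Specificity at the highest threshold whose sensitivity meets the target.

    A score is called positive when it is >= the threshold (an assumed convention).
    """
    cases = sorted(case_scores, reverse=True)
    # Number of cases that must be called positive to reach the target sensitivity.
    n_needed = math.ceil(target_sensitivity * len(cases))
    threshold = cases[n_needed - 1]
    # Controls strictly below the threshold are true negatives.
    true_negatives = sum(1 for score in control_scores if score < threshold)
    return true_negatives / len(control_scores)


# Toy example with hypothetical density-based risk scores.
toy_cases = [0.9, 0.8, 0.6, 0.2]
toy_controls = [0.1, 0.3, 0.5, 0.7, 0.65]
toy_specificity = specificity_at_sensitivity(toy_cases, toy_controls, 0.75)
```

Fixing sensitivity first and then comparing specificities puts two measures on the same operating point, which is why the abstract can compare 13% with 10% directly.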
In Australia, there is a high burden of acute rheumatic fever (ARF) among Aboriginal and Torres Strait Islander peoples. Clinical diagnostic criteria can result in a diagnosis of ‘definite’, ‘probable’ or ‘possible’ ARF and outcomes range from recovery to severe rheumatic heart disease (RHD). We compared outcomes by ARF diagnosis, where the main outcome was defined as disease progression from: possible to probable ARF, definite ARF or RHD; probable to definite ARF or RHD; or definite ARF to definite ARF recurrence or RHD. Data were extracted from the Northern Territory RHD register for Indigenous Australians with an initial diagnosis of ARF during the 5.5-year study period (01/01/2013–30/06/2019). Descriptive statistics were used to describe cohort characteristics, probability of survival, and cumulative incidence risk of disease progression. Cox proportional hazards regression was used to determine whether time to disease progression differed according to ARF diagnosis. Sub-analyses on RHD outcome, clinical manifestations, and antibiotic adherence were also performed. In total there were 913 cases with an initial ARF diagnosis. Of the 732 cases with normal baseline echocardiography, 92 (13%) experienced disease progression. The probability of disease progression differed significantly between ARF diagnoses (p = 0.0043; log rank test). Cumulative incidence risk of disease progression at 5.5 years was 33.6% (95% CI 23.6–46.2) for definite, 13.5% (95% CI 8.8–20.6) for probable and 11.4% (95% CI 6.0–21.3) for possible ARF. Disease progression was 2.19 times more likely in those with definite ARF than those with possible ARF (p = 0.026). Progression to RHD was reported in 52/732 (7%) of ARF cases with normal baseline echocardiography. There was a significantly higher risk of progression from no RHD to RHD if the initial diagnosis was definite compared to possible ARF (p<0.001). These data provide a useful way to stratify risk and guide prognosis for people diagnosed with ARF and can help inform practice.
BACKGROUND Cirrus is an automated risk predictor for breast cancer that comprises texture-based mammographic features and is mostly independent of mammographic density. We investigated genetic and environmental sources of variation in Cirrus. METHODS We measured Cirrus for 3195 breast-cancer-free participants, including 527 pairs of monozygotic (MZ) twins, 271 pairs of dizygotic (DZ) twins, and 1599 siblings of twins. Multivariate normal models were used to estimate the variance and familial correlations of age-adjusted Cirrus as a function of age. The classic twin model was expanded to allow the shared environment effects to differ by zygosity. The single-nucleotide polymorphism (SNP)-based heritability was estimated for a subset of 2356 participants. RESULTS There was no evidence that the variance or familial correlations depended on age. The familial correlations were 0.52 (standard error [SE] = 0.03) for MZ pairs and 0.16 (SE = 0.03) for DZ and non-twin sister pairs combined. Shared environmental factors specific to MZ pairs accounted for 20% of the variance. Additive genetic factors accounted for 32% (SE = 5%) of the variance, consistent with the SNP-based heritability of 36% (SE = 16%). CONCLUSIONS Cirrus is substantially familial due to genetic factors and an influence of shared environmental factors that was evident for MZ twin pairs only. The latter could be due to non-genetic factors operating in utero or in early life that are shared by MZ twins. IMPACT Early-life factors, shared more by MZ pairs than by DZ/non-twin sister pairs, could play a role in the variation in Cirrus, consistent with early life being recognised as a critical window of vulnerability to breast carcinogens.
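The reported variance components can be checked with a back-of-envelope calculation. The Python sketch below assumes a simplified model in which DZ and non-twin sister pairs share half of the additive genetic variance and no shared environment, while MZ pairs share all of the additive genetic variance plus an MZ-specific shared environment. This is an assumed simplification for illustration, not the multivariate normal model the authors fitted, and the function name is hypothetical.

```python
def decompose_twin_correlations(r_mz, r_dz):
    """Split standardised variance using MZ and DZ/sister pair correlations.

    Assumed model: r_dz = A / 2 and r_mz = A + C_mz, where A is the additive
    genetic share and C_mz is the shared-environment share specific to MZ pairs.
    """
    additive_genetic = 2.0 * r_dz
    mz_specific_environment = r_mz - additive_genetic
    individual_specific = 1.0 - r_mz
    return additive_genetic, mz_specific_environment, individual_specific


# Correlations quoted in the abstract: 0.52 for MZ pairs, 0.16 for DZ/sister pairs.
a_share, c_mz_share, e_share = decompose_twin_correlations(0.52, 0.16)
```

Under these assumptions the additive genetic share comes out at 0.32 and the MZ-specific environmental share at 0.20, in line with the 32% and 20% quoted in the abstract.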
Background Mammogram risk scores based on texture and density defined by different brightness thresholds are associated with breast cancer risk differently and could reveal distinct information about breast cancer risk. We aimed to investigate causal relationships between these intercorrelated mammogram risk scores to determine their relevance to breast cancer aetiology. Methods We used digitised mammograms for 371 monozygotic twin pairs, aged 40–70 years without a prior diagnosis of breast cancer at the time of mammography, from the Australian Mammographic Density Twins and Sisters Study. We generated normalised, age-adjusted, and standardised risk scores based on textures using the Cirrus algorithm and on three spatially independent dense areas defined by increasing brightness threshold: light areas, bright areas, and brightest areas. Causal inference was made using the Inference about Causation from Examination of FAmilial CONfounding (ICE FALCON) method. Results The mammogram risk scores were correlated within twin pairs and with each other (r = 0.22–0.81; all P < 0.005). We estimated that 28–92% of the associations between the risk scores could be attributed to causal relationships between the scores, with the rest attributed to familial confounders shared by the scores. There was consistent evidence for positive causal effects: of Cirrus, light areas, and bright areas on the brightest areas (accounting for 34%, 55%, and 85% of the associations, respectively); and of light areas and bright areas on Cirrus (accounting for 37% and 28%, respectively). Conclusions In a mammogram, the lighter (less dense) areas have a causal effect on the brightest (highly dense) areas, including through a causal pathway via textural features. These causal relationships help us gain insight into the relative aetiological importance of different mammographic features in breast cancer.
For example, our findings are consistent with the brightest areas being more aetiologically important than lighter areas for screen-detected breast cancer and, conversely, with light areas being more aetiologically important for interval breast cancer. Additionally, specific textural features capture aetiologically independent breast cancer risk information from dense areas. These findings highlight the utility of ICE FALCON and family data in decomposing the associations between intercorrelated disease biomarkers into distinct biological pathways.
BACKGROUND DEPendency of association on the number of Top Hits (DEPTH) is an approach to identify candidate susceptibility regions by considering the risk signals from overlapping groups of sequential variants across the genome. METHODS We applied a DEPTH analysis, using a sliding window of 200 SNPs, to colorectal cancer (CRC) data from the Colon Cancer Family Registry (CCFR; 5,735 cases and 3,688 controls) and GECCO (8,865 cases and 10,285 controls) studies. A DEPTH score >1 was used to identify candidate susceptibility regions common to both studies. We compared DEPTH results against those from conventional GWAS analyses of these two studies as well as against 132 published susceptibility regions. RESULTS Initial DEPTH analysis revealed 2,622 (CCFR) and 3,686 (GECCO) candidate susceptibility regions, of which 569 were common to both studies. Bootstrapping revealed 40 and 49 candidate susceptibility regions in the CCFR and GECCO data sets, respectively. Notably, DEPTH identified at least 82 regions that would not be detected using conventional GWAS methods and that had not been identified by previous CRC GWASs. We found four reproducible candidate susceptibility regions (2q22.2, 2q33.1, 6p21.32, 13q14.3). The highest DEPTH scores were in the HLA locus at 6p21, where the most strongly associated SNPs were rs762216297, rs149490268, rs114741460, and rs199707618 for the CCFR data, and rs9270761 for the GECCO data. CONCLUSIONS DEPTH can identify candidate susceptibility regions for CRC not identified using conventional analyses of larger datasets. IMPACT DEPTH has potential as a powerful complementary tool to conventional GWAS analyses for discovering susceptibility regions within the genome.
Abstract Background The extent to which known and unknown factors explain how much people of the same age differ in disease risk is fundamental to epidemiology. Risk factors can be correlated in relatives, so familial aspects of risk (genetic and non-genetic) must be considered. Development We present a unifying model (VALID) for variance in risk, with risk defined as log(incidence) or logit(cumulative incidence). Consider a normally distributed risk score with incidence increasing exponentially as the risk increases. VALID’s building block is variance in risk, Δ², where Δ = log(OPERA) is the difference in mean between cases and controls and OPERA is the odds ratio per standard deviation. A risk score with correlation r between a pair of relatives generates a familial odds ratio of exp(rΔ²). Familial risk ratios, therefore, can be converted into variance components of risk, extending Fisher’s classic decomposition of familial variation to binary traits. Under VALID, there is a natural upper limit to variance in risk caused by genetic factors, determined by the familial odds ratio for genetically identical twin pairs, but not to variation caused by non-genetic factors. Application For female breast cancer, VALID quantified how much variance in risk is explained, at different ages, by known and unknown major genes and polygenes, non-genomic risk factors correlated in relatives, and known individual-specific factors. Conclusion VALID has shown that, while substantial genetic risk factors have been discovered, much is unknown about genetic and familial aspects of breast cancer risk, especially for young women, and little is known about individual-specific variance in risk.
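The two relations stated in this abstract, Δ = log(OPERA) and a familial odds ratio of exp(rΔ²), can be turned into a small calculator. The Python sketch below is only an illustration of those two formulas; the function names and the example values are hypothetical and are not results from the paper.

```python
import math


def variance_in_risk(opera):
    """Variance in risk, delta squared, where delta = log(OPERA)."""
    delta = math.log(opera)
    return delta ** 2


def familial_odds_ratio(opera, r):
    """Familial odds ratio exp(r * delta^2) for a risk score correlated r in relatives."""
    return math.exp(r * variance_in_risk(opera))


def variance_from_familial_odds_ratio(familial_or, r):
    """Invert exp(r * delta^2): variance in risk implied by a familial odds ratio."""
    return math.log(familial_or) / r


# Hypothetical example: OPERA = e gives delta = 1 and variance in risk = 1.
example_familial_or = familial_odds_ratio(math.e, 0.5)
example_variance = variance_from_familial_odds_ratio(example_familial_or, 0.5)
```

The inverse function shows the direction used for variance decomposition: an observed familial odds ratio and an assumed correlation r between relatives imply a variance in risk.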
Background: Outcomes after acute rheumatic fever (ARF) diagnosis are variable, ranging from recovery to development of severe rheumatic heart disease (RHD). There is no diagnostic test. Evaluation using the Australian clinical diagnostic criteria can result in a diagnosis of definite, probable or possible ARF. The possible category was introduced in 2013 in Australia's Northern Territory (NT). Our aim was to compare longitudinal outcomes after a diagnosis of definite, probable or possible ARF. Methods: We extracted data from the NT RHD register for Indigenous Australians with an initial diagnosis of ARF during the 5.5-year study period (01/01/2013 - 30/06/2019). Descriptive statistics were used to describe the demographic and clinical characteristics at initial ARF diagnosis. Kaplan-Meier curves were used to assess the probability of survival free of disease progression, and the cumulative incidence risk at each year since initial diagnosis was calculated. Cox proportional hazards regression was used to determine whether time to disease progression differed according to ARF diagnosis and whether progression was associated with specific predictors at diagnosis. A multinomial logistic regression model was fitted to assess whether ARF diagnosis was associated with RHD outcome and to assess associations between ARF diagnosis and clinical manifestations. A generalised linear mixed model (GLMM) was developed to assess any differences in long-term antibiotic adherence between ARF diagnosis categories and to examine longitudinal trends in adherence. Results: There were 913 initial ARF cases, 732 with normal baseline echocardiography. Of these, 92 (13%) experienced disease progression: definite ARF 61/348 (18%); probable ARF 20/181 (11%); possible ARF 11/203 (5%). The proportion of ARF diagnoses that were uncertain (i.e. possible or probable) increased over time, from 22/78 (28%) in 2013 to 98/193 (51%) in 2018.
Cumulative incidence risk of any disease progression at 5.5 years was 33.6% (95% CI 23.6-46.2) for definite ARF, 13.5% (95% CI 8.8-20.6) for probable ARF and 11.4% (95% CI 6.0-21.3) for possible ARF. The probability of disease-free survival was lowest for definite ARF and highest for possible ARF (p=0.004). Cox proportional hazards regression indicated that disease progression was 2.19 times more likely in those with definite ARF than those with possible ARF (p=0.026). Progression to RHD was reported in 37/348 (11%) definite ARF, 10/181 (6%) probable ARF, and 5/203 (2%) possible ARF cases. The multinomial logistic regression model demonstrated a significantly higher risk of progression from no RHD to RHD if the initial diagnosis was definite compared to possible ARF (p<0.001 for both mild and moderate-severe RHD outcomes). The GLMM estimated that patients with definite ARF had significantly higher adherence to antibiotic prophylaxis than those with probable ARF (p=0.024). Conclusion: These data indicate that the ARF diagnostic categories are being applied appropriately, are capturing more uncertain cases over time, provide a useful way to stratify risk and guide prognosis, and can help inform practice. Possible ARF is not entirely benign; some cases progress to RHD.
Supplemental material is available for this article. Keywords: Mammography, Screening, Convolutional Neural Network (CNN) Published under a CC BY 4.0 license. See also the commentary by Cadrin-Chênevert in this issue.
Abstract Background Glioma accounts for approximately 80% of malignant adult brain cancer and its most common subtype, glioblastoma, has one of the lowest 5-year survival rates of any cancer. Fifty risk-associated variants within 34 glioma genetic risk regions have been found by genome-wide association studies (GWAS), with a sex difference reported for the 8q24.21 region. We conducted an Australian GWAS by glioma subtype and sex. Methods We analyzed genome-wide data from the Australian Genomics and Clinical Outcomes of Glioma (AGOG) consortium for 7 573 692 single nucleotide polymorphisms (SNPs) for 560 glioma cases and 2237 controls of European ancestry. Cases were classified as glioblastoma, non-glioblastoma, astrocytoma or oligodendroglioma. Logistic regression analysis was used to assess the associations of SNPs with glioma risk by subtype and by sex. Results We replicated the previously reported glioma risk associations in the regions of 2q33.3 C2orf80, 2q37.3 D2HGDH, 5p15.33 TERT, 7p11.2 EGFR, 8q24.21 CCDC26, 9p21.3 CDKN2BAS, 11q21 MAML2, 11q23.3 PHLDB1, 15q24.2 ETFA, 16p13.3 RHBDF1, 16p13.3 LMF1, 17p13.1 TP53, 20q13.33 RTEL, and 20q13.33 GMEB2 (P < .05). We also replicated the previously reported sex difference at 8q24.21 CCDC26 (P = .0024), with the association being nominally significant for both sexes (P < .05). Conclusions Our study supports a stronger female risk association for the region 8q24.21 CCDC26 and highlights the importance of analyzing glioma GWAS by sex. A better understanding of sex differences could provide biological insight into the cause of glioma with implications for prevention, risk prediction and treatment.
Introduction: In Australia and New Zealand, liver allocation is needs based (based on model for end-stage liver disease score). An alternative allocation system is a transplant benefit-based model. Transplant benefit is quantified by complex waitlist and transplant survival prediction models. Research Questions: To validate the UK transplant benefit score in an Australia and New Zealand population. Design: This study analyzed data on listings and transplants for chronic liver disease between 2009 and 2018, using the Australia and New Zealand Liver and Intestinal Transplant Registry. Excluded were variant syndromes, hepatocellular cancer, urgent listings, pediatric, living donor, and multi-organ listings and transplants. UK transplant benefit waitlist and transplant benefit score were calculated for listings and transplants, respectively. Outcomes were time to waitlist death and time to transplant failure. Calibration and discrimination were assessed with Kaplan–Meier analysis and C-statistics. Results: There were differences in the UK and Australia and New Zealand listing, transplant, and donor populations including older recipient age, higher recipient and donor body mass index, and higher incidence of hepatitis C in the Australia and New Zealand population. Waitlist scores were calculated for 2241 patients and transplant scores were calculated for 1755 patients. The waitlist model C-statistic at 5 years was 0.70 and the transplant model C-statistic was 0.56, with poor calibration of both models. Conclusion: The UK transplant benefit score model performed poorly, suggesting that UK benefit-based allocation would not improve overall outcomes in Australia and New Zealand. Generalizability of survival prediction models was limited by differences in transplant populations and practices.
Polygenic risk scores (PRSs) are a promising approach to accurately predict an individual’s risk of developing disease. The areas under the receiver operating characteristic curve (AUCs) of PRSs are often reported only for models that are adjusted for age and sex, which are known risk factors for the disease of interest and confound the association between the PRS and the disease. This makes comparison of PRSs between studies difficult because the genetic effects cannot be disentangled from the effects of age and sex (which have a high AUC without the PRS). In this study, we used data from the UK Biobank and applied the stacked clumping and thresholding method and a variation called the maximum clumping and thresholding method to develop PRSs to predict coronary artery disease, hypertension, atrial fibrillation, stroke and type 2 diabetes. We created case-control training datasets in which age and sex were controlled by design. We also excluded prevalent cases to prevent biased estimation of disease risks. The maximum clumping and thresholding PRSs required many fewer single-nucleotide polymorphisms to achieve almost the same discriminatory ability as the stacked clumping and thresholding PRSs. Using the testing datasets, the AUCs for the maximum clumping and thresholding PRSs were 0.599 (95% confidence interval [CI]: 0.585, 0.613) for atrial fibrillation, 0.572 (95% CI: 0.560, 0.584) for coronary artery disease, 0.585 (95% CI: 0.564, 0.605) for type 2 diabetes, 0.559 (95% CI: 0.550, 0.569) for hypertension and 0.514 (95% CI: 0.494, 0.535) for stroke. By developing a PRS using a dataset in which age and sex are controlled by design, we have obtained true estimates of the discriminatory ability of the PRSs alone rather than estimates that include the effects of age and sex.
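For readers less familiar with the AUC figures quoted above, the Python sketch below computes an AUC as the probability that a randomly chosen case has a higher score than a randomly chosen control, counting ties as one half. This rank-based definition is standard, but the function name and the toy scores are hypothetical, and the sketch does not reproduce the clumping and thresholding methods used in the study.

```python
def auc_from_scores(case_scores, control_scores):
    """AUC as the fraction of case-control pairs in which the case scores higher.

    Tied pairs contribute one half, matching the Mann-Whitney formulation.
    """
    wins = 0.0
    for case_score in case_scores:
        for control_score in control_scores:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Toy example: hypothetical PRS values for cases and for controls.
toy_auc = auc_from_scores([0.2, 0.8], [0.5])
```

When cases and controls are matched on age and sex by design, an AUC computed this way on the PRS alone reflects only the score's own discrimination, which is the comparison the abstract argues for.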
The horseshoe prior is known to possess many desirable properties for Bayesian estimation of sparse parameter vectors, yet its density function lacks an analytic form. As such, it is challenging to find a closed-form solution for the posterior mode. Conventional horseshoe estimators use the posterior mean to estimate the parameters, but these estimates are not sparse. We propose a novel expectation-maximisation (EM) procedure for computing the maximum a posteriori (MAP) estimates of the parameters in the case of the standard linear model. A particular strength of our approach is that the M-step depends only on the form of the prior and is independent of the form of the likelihood. We introduce several simple modifications of this EM procedure that allow for straightforward extension to generalised linear models. In experiments performed on simulated and real data, our approach performs comparably to, or better than, state-of-the-art sparse estimation methods in terms of statistical performance and computational cost.