Free Biology Documents

Download free Biology study materials

Personalized and graph genomes reveal missing signal in epigenomic data

Epigenomic studies that use next generation sequencing experiments typically rely on the alignment of reads to a reference sequence. However, because of genetic diversity and the diploid nature of the human genome, we hypothesize that using a generic reference could lead to incorrectly mapped reads and bias downstream results.

4/6/2023 6:12:49 AM +00:00

Protection from DNA re-methylation by transcription factors in primordial germ cells and pre-implantation embryos can explain trans-generational epigenetic inheritance

A growing body of evidence suggests that certain epiphenotypes can be passed across generations via both the male and female germlines of mammals. These observations have been difficult to explain owing to a global loss of the majority of known epigenetic marks present in parental chromosomes during primordial germ cell development and after fertilization.

4/6/2023 6:12:40 AM +00:00

Whole-genome sequencing of glioblastoma reveals enrichment of non-coding constraint mutations in known and novel genes

Glioblastoma (GBM) has one of the worst 5-year survival rates of all cancers. While genomic studies of the disease have been performed, alterations in the non-coding regulatory regions of GBM have largely remained unexplored.

4/6/2023 6:12:26 AM +00:00

APEC: An accesson-based method for single-cell chromatin accessibility analysis

The development of sequencing technologies has promoted the survey of genome-wide chromatin accessibility at single-cell resolution. However, comprehensive analysis of single-cell epigenomic profiles remains a challenge.

4/6/2023 6:12:18 AM +00:00

Sampling time-dependent artifacts in single-cell genomics studies

Robust protocols and automation now enable large-scale single-cell RNA and ATAC sequencing experiments and their application on biobank and clinical cohorts. However, technical biases introduced during sample acquisition can hinder solid, reproducible results, and a systematic benchmarking is required before entering large-scale data production.

4/6/2023 6:12:07 AM +00:00

Chromatin topology reorganization and transcription repression by PML-RARα in acute promyeloid leukemia

Acute promyeloid leukemia (APL) is characterized by the oncogenic fusion protein PML-RARα, a major etiological agent in APL. However, the molecular mechanisms underlying the role of PML-RARα in leukemogenesis remain largely unknown.

4/6/2023 6:11:57 AM +00:00

Gapless assembly of maize chromosomes using long-read technologies

Creating gapless telomere-to-telomere assemblies of complex genomes is one of the ultimate challenges in genomics. We use two independent assemblies and an optical map-based merging pipeline to produce a maize genome (B73-Ab10) composed of 63 contigs and a contig N50 of 162 Mb.

4/6/2023 6:11:50 AM +00:00

Accounting for cell type hierarchy in evaluating single cell RNA-seq clustering

Cell clustering is one of the most common routines in single cell RNA-seq data analyses, for which a number of specialized methods are available. The evaluation of these methods ignores an important biological characteristic that the structure for a population of cells is hierarchical, which could result in misleading evaluation results.

4/6/2023 6:11:40 AM +00:00

PTWAS: Investigating tissue-relevant causal molecular mechanisms of complex traits using probabilistic TWAS analysis

We propose a new computational framework, probabilistic transcriptome-wide association study (PTWAS), to investigate causal relationships between gene expressions and complex traits. PTWAS applies the established principles from instrumental variables analysis and takes advantage of probabilistic eQTL annotations to delineate and tackle the unique challenges arising in TWAS.

4/6/2023 6:11:27 AM +00:00

BpForms and BcForms: A toolkit for concretely describing non-canonical polymers and complexes to facilitate global biochemical networks

Non-canonical residues, caps, crosslinks, and nicks are important to many functions of DNAs, RNAs, proteins, and complexes. However, we do not fully understand how networks of such non-canonical macromolecules generate behavior. One barrier is our limited formats for describing macromolecules.

4/6/2023 6:11:17 AM +00:00

Terminating contamination: Large-scale search identifies more than 2,000,000 contaminated entries in GenBank

Genomic analyses are sensitive to contamination in public databases caused by incorrectly labeled reference sequences. Here, we describe Conterminator, an efficient method to detect and remove incorrectly labeled sequences by an exhaustive all-against-all sequence comparison.

4/6/2023 6:11:08 AM +00:00

RNA structural dynamics regulate early embryogenesis through controlling transcriptome fate and function

Vertebrate early embryogenesis is initially directed by a set of maternal RNAs and proteins, yet the mechanisms controlling this program remain largely unknown. Recent transcriptome-wide studies on RNA structure have revealed its pervasive and crucial roles in RNA processing and functions, but whether and how RNA structure regulates the fate of the maternal transcriptome have yet to be determined.

4/6/2023 6:10:58 AM +00:00

Compressing gene expression data using multiple latent space dimensionalities learns complementary biological representations

Unsupervised compression algorithms applied to gene expression data extract latent or hidden signals representing technical and biological sources of variation. However, these algorithms require a user to select a biologically appropriate latent space dimensionality.

4/6/2023 6:10:46 AM +00:00

Lifestyle and the presence of helminths is associated with gut microbiome composition in Cameroonians

African populations provide a unique opportunity to interrogate host-microbe co-evolution and its impact on adaptive phenotypes due to their genomic, phenotypic, and cultural diversity. We integrate gut microbiome 16S rRNA amplicon and shotgun metagenomic sequence data with quantification of pathogen burden and measures of immune parameters for 575 ethnically diverse Africans from Cameroon.

4/6/2023 6:10:39 AM +00:00

A human lung tumor microenvironment interactome identifies clinically relevant cell-type cross-talk

Tumors comprise a complex microenvironment of interacting malignant and stromal cell types. Much of our understanding of the tumor microenvironment comes from in vitro studies isolating the interactions between malignant cells and a single stromal cell type, often along a single pathway.

4/6/2023 6:10:27 AM +00:00

CircAtlas: An integrated resource of one million highly accurate circular RNAs from 1070 vertebrate transcriptomes

Existing circular RNA (circRNA) databases have become essential for transcriptomics. However, most are unsuitable for mining in-depth information for candidate circRNA prioritization. To address this, we integrate circular transcript collections to develop the circAtlas database based on 1070 RNA-seq samples collected from 19 normal tissues across six vertebrate species.

4/6/2023 6:10:16 AM +00:00

ExpansionHunter Denovo: A computational method for locating known and novel repeat expansions in short-read sequencing data

Repeat expansions are responsible for over 40 monogenic disorders, and undoubtedly more pathogenic repeat expansions remain to be discovered. Existing methods for detecting repeat expansions in short-read sequencing data require predefined repeat catalogs. Recent discoveries emphasize the need for methods that do not require pre-specified candidate repeats.

4/6/2023 6:10:08 AM +00:00

MOFA+: A statistical framework for comprehensive integration of multi-modal single-cell data

Technological advances have enabled the profiling of multiple molecular layers at single-cell resolution, assaying cells from multiple samples or conditions. Consequently, there is a growing need for computational strategies to analyze data from complex experimental designs that include multiple data modalities and multiple groups of samples.

4/6/2023 6:09:55 AM +00:00

CCMetagen: Comprehensive and accurate identification of eukaryotes and prokaryotes in metagenomic data

There is an increasing demand for accurate and fast metagenome classifiers that can not only identify bacteria, but all members of a microbial community. We used a recently developed concept in read mapping to develop a highly accurate metagenomic classification pipeline named CCMetagen.

4/6/2023 6:09:48 AM +00:00

FORK-seq: Replication landscape of the Saccharomyces cerevisiae genome by nanopore sequencing

Genome replication mapping methods profile cell populations, masking cell-to-cell heterogeneity. Here, we describe FORK-seq, a nanopore sequencing method to map replication of single DNA molecules at 200-nucleotide resolution.

4/6/2023 6:09:39 AM +00:00

REPIC: A database for exploring the N6-methyladenosine methylome

The REPIC (RNA EPItranscriptome Collection) database records about 10 million peaks called from publicly available m6A-seq and MeRIP-seq data using our unified pipeline. These data were collected from 672 samples of 49 studies, covering 61 cell lines or tissues in 11 organisms.

4/6/2023 6:09:31 AM +00:00

Developmental regulation of canonical and small ORF translation from mRNAs

Ribosomal profiling has revealed the translation of thousands of sequences outside annotated protein-coding genes, including small open reading frames of less than 100 codons, and the translational regulation of many genes.

4/6/2023 6:09:18 AM +00:00

Integrative analyses of the RNA modification machinery reveal tissue- and cancer-specific signatures

RNA modifications play central roles in cellular fate and differentiation. However, the machinery responsible for placing, removing, and recognizing more than 170 RNA modifications remains largely uncharacterized and poorly annotated, and we currently lack integrative studies that identify which RNA modification-related proteins (RMPs) may be dysregulated in each cancer type.

4/6/2023 6:09:05 AM +00:00

Influenza infection elicits an expansion of gut population of endogenous Bifidobacterium animalis which protects mice against infection

Influenza is a severe respiratory illness that continually threatens global health. It has been widely known that gut microbiota modulates the host response to protect against influenza infection, but mechanistic details remain largely unknown.

4/6/2023 6:08:54 AM +00:00

Single-cell RNA-seq with spike-in cells enables accurate quantification of cell-specific drug effects in pancreatic islets

Single-cell RNA-seq (scRNA-seq) is emerging as a powerful tool to dissect cell-specific effects of drug treatment in complex tissues. This application requires high levels of precision, robustness, and quantitative accuracy—beyond those achievable with existing methods for mainly qualitative single-cell analysis.

4/6/2023 6:08:45 AM +00:00

Lamina-associated domains: Peripheral matters and internal affairs

At the nuclear periphery, associations of chromatin with the nuclear lamina through lamina-associated domains (LADs) aid functional organization of the genome. We review the organization of LADs and provide evidence of LAD heterogeneity from cell ensemble and single-cell data.

4/6/2023 6:08:37 AM +00:00

Leveraging biological and statistical covariates improves the detection power in epigenome-wide association testing

Epigenome-wide association studies (EWAS), which seek the association between epigenetic marks and an outcome or exposure, involve multiple hypothesis testing. False discovery rate (FDR) control has been widely used for multiple testing correction. However, traditional FDR control methods do not use auxiliary covariates, and they could be less powerful if the covariates could inform the likelihood of the null hypothesis.

4/6/2023 6:08:27 AM +00:00

Inference of single-cell phylogenies from lineage tracing data using Cassiopeia

The pairing of CRISPR/Cas9-based gene editing with massively parallel single-cell readouts now enables large-scale lineage tracing. However, the rapid growth in complexity of data from these assays has outpaced our ability to accurately infer phylogenetic relationships.

4/6/2023 6:08:19 AM +00:00

The preceding root system drives the composition and function of the rhizosphere microbiome

The soil environment is responsible for sustaining most terrestrial plant life, yet we know surprisingly little about the important functions carried out by diverse microbial communities in soil. Soil microbes that inhabit the channels of decaying root systems, the detritusphere, are likely to be essential for plant growth and health, as these channels are the preferred locations of new root growth.

4/6/2023 6:08:11 AM +00:00

Wheat chromatin architecture is organized in genome territories and transcription factories

Polyploidy is ubiquitous in eukaryotic plant and fungal lineages, and it leads to the co-existence of several copies of similar or related genomes in one nucleus. In plants, polyploidy is considered a major factor in successful domestication.

4/6/2023 6:07:59 AM +00:00