Free Biology Documents

The somatic mutation landscape of the human body

Somatic mutations in healthy tissues contribute to aging, neurodegeneration, and cancer initiation, yet they remain largely uncharacterized. To gain a better understanding of the genome-wide distribution and functional impact of somatic mutations, we leverage the genomic information contained in the transcriptome to uniformly call somatic mutations from over 7500 tissue samples, representing 36 distinct tissues.

4/6/2023 9:14:44 AM +00:00

A comparison framework and guideline of clustering methods for mass cytometry data

With the expanding applications of mass cytometry in medical research, a wide variety of clustering methods, both semi-supervised and unsupervised, have been developed for data analysis. Selecting the optimal clustering method can accelerate the identification of meaningful cell populations.

4/6/2023 9:14:38 AM +00:00

RADAR: Differential analysis of MeRIP-seq data with a random effect model

Epitranscriptome profiling using MeRIP-seq is a powerful technique for in vivo functional studies of reversible RNA modifications. We develop RADAR, a comprehensive analytical tool for detecting differentially methylated loci in MeRIP-seq data. RADAR enables accurate identification of altered methylation sites by accommodating variability of pre-immunoprecipitation expression level and post-immunoprecipitation count using different strategies.

4/6/2023 9:14:30 AM +00:00

Within-species contamination of bacterial whole-genome sequence data has a greater influence on clustering analyses than between-species contamination

Although it is assumed that contamination in bacterial whole-genome sequencing causes errors, the influences of contamination on clustering analyses, such as single-nucleotide polymorphism discovery, phylogenetics, and multilocus sequence typing, have not been quantified.

4/6/2023 9:14:20 AM +00:00

SyRI: Finding genomic rearrangements and local sequence differences from whole-genome assemblies

Genomic differences range from single nucleotide differences to complex structural variations. Current methods typically annotate sequence differences ranging from SNPs to large indels accurately but do not unravel the full complexity of structural rearrangements, including inversions, translocations, and duplications, where highly similar sequence changes in location, orientation, or copy number.

4/6/2023 9:14:11 AM +00:00

Transcriptome assembly from long-read RNA-seq alignments with StringTie2

RNA sequencing using the latest single-molecule sequencing instruments produces reads that are thousands of nucleotides long. The ability to assemble these long reads can greatly improve the sensitivity of long-read analyses. Here we present StringTie2, a reference-guided transcriptome assembler that works with both short and long reads.

4/6/2023 9:14:04 AM +00:00

Paragraph: A graph-based structural variant genotyper for short-read sequence data

Accurate detection and genotyping of structural variations (SVs) from short-read data is a long-standing area of development in genomics research and clinical sequencing pipelines. We introduce Paragraph, an accurate genotyper that models SVs using sequence graphs and SV annotations.

4/6/2023 9:13:58 AM +00:00

Curing hemophilia A by NHEJ-mediated ectopic F8 insertion in the mouse

Hemophilia A, a bleeding disorder resulting from F8 mutations, can only be cured by gene therapy. A promising strategy is CRISPR-Cas9-mediated precise insertion of F8 in hepatocytes at highly expressed gene loci, such as albumin (Alb). Unfortunately, the precise in vivo integration efficiency of a long insert is very low (~0.1%).

4/6/2023 9:13:50 AM +00:00

Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline

Sequencing technology and assembly algorithms have matured to the point that high-quality de novo assembly is possible for large, repetitive genomes. Current assemblies traverse transposable elements (TEs) and provide an opportunity for comprehensive annotation of TEs. Numerous methods exist for annotation of each class of TEs, but their relative performances have not been systematically compared.

4/6/2023 9:13:43 AM +00:00

TRITEX: Chromosome-scale sequence assembly of Triticeae genomes with open-source tools

Chromosome-scale genome sequence assemblies underpin pan-genomic studies. Recent genome assembly efforts in the large-genome Triticeae crops wheat and barley have relied on the commercial closed-source assembly algorithm DeNovoMagic. We present TRITEX, an open-source computational workflow that combines paired-end, mate-pair, and 10X Genomics linked-read data with chromosome conformation capture sequencing data to construct sequence scaffolds with megabase-scale contiguity ordered into chromosomal pseudomolecules.

4/6/2023 9:13:37 AM +00:00

Accuracy, robustness and scalability of dimensionality reduction methods for single-cell RNA-seq analysis

Dimensionality reduction is an indispensable analytic component for many areas of single-cell RNA sequencing (scRNA-seq) data analysis. Proper dimensionality reduction can allow for effective noise removal and facilitate many downstream analyses that include cell clustering and lineage reconstruction.

4/6/2023 9:13:29 AM +00:00

PASTMUS: Mapping functional elements at single amino acid resolution in human cells

Identification of functional elements for a protein of interest is important for achieving a mechanistic understanding. However, it remains cumbersome to assess each and every amino acid of a given protein in relevance to its functional significance.

4/6/2023 9:13:22 AM +00:00

CTCF modulates allele-specific sub-TAD organization and imprinted gene activity at the mouse Dlk1-Dio3 and Igf2-H19 domains

Genomic imprinting is essential for mammalian development and provides a unique paradigm to explore intra-cellular differences in chromatin configuration. So far, the detailed allele-specific chromatin organization of imprinted gene domains has mostly been lacking.

4/6/2023 9:13:16 AM +00:00

Clustered CTCF binding is an evolutionary mechanism to maintain topologically associating domains

CTCF binding contributes to the establishment of a higher-order genome structure by demarcating the boundaries of large-scale topologically associating domains (TADs). However, despite the importance and conservation of TADs, the role of CTCF binding in their evolution and stability remains elusive.

4/6/2023 9:13:08 AM +00:00

OnTAD: Hierarchical domain structure reveals the divergence of activity among TADs and boundaries

The spatial organization of chromatin in the nucleus has been implicated in regulating gene expression. Maps of high-frequency interactions between different segments of chromatin have revealed topologically associating domains (TADs), within which most of the regulatory interactions are thought to occur.

4/6/2023 9:13:02 AM +00:00

Whole genome DNA sequencing provides an atlas of somatic mutagenesis in healthy human cells and identifies a tumor-prone cell type

The lifelong accumulation of somatic mutations underlies age-related phenotypes and cancer. Mutagenic forces are thought to shape the genome of aging cells in a tissue-specific way. Whole genome analyses of somatic mutation patterns, based on both types and genomic distribution of variants, can shed light on specific processes active in different human tissues and their effect on the transition to cancer.

4/6/2023 9:12:55 AM +00:00

Improved metagenomic analysis with Kraken 2

Although Kraken’s k-mer-based approach provides a fast taxonomic classification of metagenomic sequence data, its large memory requirements can be limiting for some applications. Kraken 2 improves upon Kraken 1 by reducing memory usage by 85%, allowing greater amounts of reference genomic data to be used, while maintaining high accuracy and increasing speed fivefold.

4/6/2023 9:12:48 AM +00:00
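
To illustrate the k-mer-based idea mentioned in the Kraken 2 abstract above, here is a minimal toy sketch in Python. It is not Kraken's actual algorithm or data structure: Kraken maps k-mers to lowest common ancestors in a taxonomy and Kraken 2 uses a compact, minimizer-based hash table, whereas this sketch simply lets unambiguous k-mers vote for a taxon. The names `build_index` and `classify`, the taxon labels, and the value k = 5 are assumptions made for the example.

```python
from collections import Counter

def kmers(seq, k):
    """Return all overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def build_index(references, k):
    """Map each k-mer to the single taxon that contains it.

    A k-mer seen in more than one taxon is stored as None (ambiguous).
    This is a simplification assumed for the sketch; a real classifier
    would record a shared ancestor instead of discarding the k-mer.
    """
    index = {}
    for taxon, seq in references.items():
        for km in kmers(seq, k):
            if km in index and index[km] != taxon:
                index[km] = None
            else:
                index[km] = taxon
    return index

def classify(read, index, k):
    """Assign the read to the taxon with the most k-mer hits, or None."""
    votes = Counter()
    for km in kmers(read, k):
        taxon = index.get(km)
        if taxon is not None:
            votes[taxon] += 1
    return votes.most_common(1)[0][0] if votes else None

if __name__ == "__main__":
    # Hypothetical toy references, not real genomes.
    refs = {"taxon_a": "ACGTACGTGGCCTTAA", "taxon_b": "TTGACCAGTTGACCAG"}
    idx = build_index(refs, k=5)
    print(classify("GTACGTGGCC", idx, k=5))
```

A dictionary keyed by full k-mers is the part that costs memory at scale, which is the limitation the abstract says Kraken 2 addresses.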

The Pseudomonas aeruginosa accessory genome elements influence virulence towards Caenorhabditis elegans

Multicellular animals and bacteria frequently engage in predator-prey and host-pathogen interactions, such as the well-studied relationship between Pseudomonas aeruginosa and the nematode Caenorhabditis elegans. This study investigates the genomic and genetic basis of bacterial-driven variability in P. aeruginosa virulence towards C. elegans to provide evolutionary insights into host-pathogen relationships.

4/6/2023 9:12:41 AM +00:00

Chromosome-level genome assembly for giant panda provides novel insights into Carnivora chromosome evolution

Chromosome evolution is an important driver of speciation and species evolution. Previous studies have detected chromosome rearrangement events among different Carnivora species using chromosome painting strategies.

4/6/2023 9:12:34 AM +00:00

The Aquilegia genome reveals a hybrid origin of core eudicots

Whole-genome duplications (WGDs) have dominated the evolutionary history of plants. One consequence of WGD is a dramatic restructuring of the genome as it undergoes diploidization, a process under which deletions and rearrangements of various sizes scramble the genetic material, leading to a repacking of the genome and eventual return to diploidy.

4/6/2023 9:12:26 AM +00:00

Guidelines for benchmarking of optimization-based approaches for fitting mathematical models

Insufficient performance of optimization-based approaches for the fitting of mathematical models is still a major bottleneck in systems biology. In this article, the reasons and methodological challenges are summarized as well as their impact in benchmark studies. Important aspects for achieving an increased level of evidence for benchmark results are discussed.

4/6/2023 9:12:20 AM +00:00

Quantifying the benefit offered by transcript assembly with Scallop-LR on single-molecule long reads

Single-molecule long-read sequencing has been used to improve mRNA isoform identification. However, not all single-molecule long reads represent full transcripts due to incomplete cDNA synthesis and sequencing length limits.

4/6/2023 9:12:12 AM +00:00

CRISPR-Cas13d mediates robust RNA virus interference in plants

CRISPR-Cas systems endow bacterial and archaeal species with adaptive immunity mechanisms to fend off invading phages and foreign genetic elements. CRISPR-Cas9 has been harnessed to confer virus interference against DNA viruses in eukaryotes, including plants.

4/6/2023 9:11:59 AM +00:00

PIRCh-seq: Functional classification of noncoding RNAs associated with distinct histone modifications

We develop PIRCh-seq, a method which enables a comprehensive survey of chromatin-associated RNAs in a histone modification-specific manner. We identify hundreds of chromatin-associated RNAs in several cell types with substantially less contamination by nascent transcripts.

4/6/2023 9:11:49 AM +00:00

Gut-derived Enterococcus faecium from ulcerative colitis patients promotes colitis in a genetically susceptible mouse host

Recent metagenomic analyses have revealed dysbiosis of the gut microbiota of ulcerative colitis (UC) patients. However, the impacts of this dysbiosis are not fully understood, particularly at the strain level.

4/6/2023 9:11:40 AM +00:00

Evaluation of commonly used analysis strategies for epigenome- and transcriptome-wide association studies through replication of large-scale population studies

A large number of analysis strategies are available for DNA methylation (DNAm) array and RNA-seq datasets, but it is unclear which strategies are best to use. We compare commonly used strategies and report how they influence results in large cohort studies.

4/6/2023 9:11:31 AM +00:00

Dashing: Fast and accurate genomic distances with HyperLogLog

Dashing is a fast and accurate software tool for estimating similarities of genomes or sequencing datasets. It uses the HyperLogLog sketch together with cardinality estimation methods that are specialized for set unions and intersections.

4/6/2023 9:11:25 AM +00:00
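
The HyperLogLog approach named in the Dashing abstract above can be sketched in a few lines of Python. This is a hedged illustration, not Dashing's implementation: it uses the classic HyperLogLog estimator with a linear-counting correction for small counts and plain inclusion-exclusion for the intersection, whereas the abstract says Dashing uses estimators specialized for unions and intersections. The register count (2^14), the SHA-1 based hash, and all function names are assumptions chosen for the example.

```python
import hashlib
import math

P = 14          # index bits (assumed for this sketch)
M = 1 << P      # number of registers

def _hash64(item):
    """64-bit hash of a string, taken from the first 8 bytes of SHA-1."""
    return int.from_bytes(hashlib.sha1(item.encode()).digest()[:8], "big")

def sketch(items):
    """Build HyperLogLog registers from an iterable of strings."""
    regs = [0] * M
    for item in items:
        h = _hash64(item)
        idx = h >> (64 - P)                       # top P bits pick a register
        rest = h & ((1 << (64 - P)) - 1)          # remaining 64 - P bits
        rank = (64 - P) - rest.bit_length() + 1   # position of the first 1-bit
        if rank > regs[idx]:
            regs[idx] = rank
    return regs

def union(a, b):
    """The sketch of a set union is the register-wise maximum."""
    return [max(x, y) for x, y in zip(a, b)]

def cardinality(regs):
    """Classic HyperLogLog estimate with small-range correction."""
    alpha = 0.7213 / (1 + 1.079 / M)
    est = alpha * M * M / sum(2.0 ** -r for r in regs)
    zeros = regs.count(0)
    if est <= 2.5 * M and zeros:
        est = M * math.log(M / zeros)             # linear counting
    return est

def jaccard(a, b):
    """Jaccard similarity via inclusion-exclusion on estimated sizes."""
    ca, cb, cu = cardinality(a), cardinality(b), cardinality(union(a, b))
    inter = max(0.0, ca + cb - cu)
    return inter / cu if cu else 0.0

if __name__ == "__main__":
    # Hypothetical item sets standing in for k-mer sets; true Jaccard is 1/3.
    a = sketch(f"kmer{i}" for i in range(20000))
    b = sketch(f"kmer{i}" for i in range(10000, 30000))
    print(round(cardinality(a)), round(jaccard(a, b), 3))
```

Because each sketch has a fixed size regardless of input, two genomes or sequencing datasets can be compared from their registers alone, at the cost of an approximate answer.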

Normalization and variance stabilization of single-cell RNA-seq data using regularized negative binomial regression

Single-cell RNA-seq (scRNA-seq) data exhibits significant cell-to-cell variation due to technical factors, including the number of molecules detected in each cell, which can confound biological heterogeneity with technical effects. To address this, we present a modeling framework for the normalization and variance stabilization of molecular count data from scRNA-seq experiments.

4/6/2023 9:11:18 AM +00:00

The majority of A-to-I RNA editing is not required for mammalian homeostasis

Adenosine-to-inosine (A-to-I) RNA editing, mediated by ADAR1 and ADAR2, occurs at tens of thousands to millions of sites across mammalian transcriptomes. A-to-I editing can change the protein coding potential of a transcript and alter RNA splicing, miRNA biology, RNA secondary structure and formation of other RNA species.

4/6/2023 9:11:11 AM +00:00

Afann: Bias adjustment for alignment-free sequence comparison based on sequencing data using neural network regression

Alignment-free methods, which are more time- and memory-efficient than alignment-based methods, have been widely used for comparing genome sequences or raw sequencing samples without assembly.

4/6/2023 9:11:04 AM +00:00