Free Biology Documents

Membrane protein orientation and refinement using a knowledge-based statistical potential

Recent increases in the number of deposited membrane protein crystal structures necessitate the use of automated computational tools to position them within the lipid bilayer. Identifying the correct orientation allows us to study the complex relationship between sequence, structure and the lipid environment, which is otherwise challenging to investigate using experimental techniques due to the difficulty in crystallising membrane proteins embedded within intact membranes.

12/29/2020 4:57:36 PM +00:00

A molecular model of the full-length human NOD-like receptor family CARD domain containing 5 (NLRC5) protein

Pattern recognition receptors of the immune system have key roles in the regulation of pathways after the recognition of microbial- and danger-associated molecular patterns in vertebrates. Members of the NOD-like receptor (NLR) family typically function intracellularly.

12/29/2020 4:57:28 PM +00:00

Comparing a few SNP calling algorithms using low-coverage sequencing data

Many Single Nucleotide Polymorphism (SNP) calling programs have been developed to identify Single Nucleotide Variations (SNVs) in next-generation sequencing (NGS) data. However, low sequencing coverage presents challenges to accurate SNV identification, especially in single-sample data.

12/29/2020 4:57:22 PM +00:00

Learning a Markov Logic network for supervised gene regulatory network inference

Gene regulatory network inference remains a challenging problem in systems biology despite the numerous approaches that have been proposed. When substantial knowledge on a gene regulatory network is already available, supervised network inference is appropriate.

12/29/2020 4:57:15 PM +00:00

Levenshtein error-correcting barcodes for multiplexed DNA sequencing

High-throughput sequencing technologies are improving in quality, capacity and cost, providing versatile applications in DNA and RNA research. For small genomes or fractions of larger genomes, DNA samples can be mixed and loaded together on the same sequencing track.

12/29/2020 4:57:09 PM +00:00

MultiChIPmixHMM: An R package for ChIP-chip data analysis modeling spatial dependencies and multiple replicates

Chromatin immunoprecipitation coupled with hybridization to a tiling array (ChIP-chip) is a cost-effective and routinely used method to identify protein-DNA interactions or chromatin/histone modifications. The robust identification of ChIP-enriched regions is frequently complicated by noisy measurements.

12/29/2020 4:57:02 PM +00:00

Maximum-parsimony haplotype frequencies inference based on a joint constrained sparse representation of pooled DNA

DNA pooling constitutes a cost-effective alternative in genome-wide association studies. In DNA pooling, equimolar amounts of DNA from different individuals are mixed into one sample and the frequency of each allele at each position is observed in a single genotype experiment.

12/29/2020 4:56:53 PM +00:00

Efficient alignment of RNA secondary structures using sparse dynamic programming

Recent advances in next-generation sequencing technology have revealed a large number of unannotated RNA transcripts. Comparative study of the RNA structurome is an important approach to assessing their biological functions.

12/29/2020 4:56:47 PM +00:00

Centroid based clustering of high throughput sequencing reads based on n-mer counts

Many problems in computational biology require alignment-free sequence comparisons. One of the common tasks involving sequence comparison is sequence clustering. Here we apply methods of alignment-free comparison (in particular, comparison using sequence composition) to the challenge of sequence clustering.

12/29/2020 4:56:40 PM +00:00

Pathway-PDT: A flexible pathway analysis tool for nuclear families

Pathway analysis based on Genome-Wide Association Studies (GWAS) data has become popular as a secondary analysis strategy. Although many pathway analysis tools have been developed for case–control studies, there is no tool that can use all information from raw genotypes in general nuclear families.

12/29/2020 4:56:34 PM +00:00

Two Pfam protein families characterized by a crystal structure of protein lpg2210 from Legionella pneumophila

Every genome contains a large number of uncharacterized proteins that may encode entirely novel biological systems. Many of these uncharacterized proteins fall into related sequence families. By applying sequence and structural analysis we hope to provide insight into novel biology.

12/29/2020 4:56:26 PM +00:00

Ontology based molecular signatures for immune cell types via gene expression analysis

New technologies are focusing on characterizing cell types to better understand their heterogeneity. With large volumes of cellular data being generated, innovative methods are needed to structure the resulting data analyses.

12/29/2020 4:56:19 PM +00:00

NPEBseq: Nonparametric empirical Bayesian-based procedure for differential expression analysis of RNA-seq data

RNA-seq, a transcriptome profiling method based on massively parallel sequencing, provides digital data in the form of aligned sequence read counts. Comparative analysis of these data requires appropriate statistical methods to estimate the differential expression of transcript variants across different cell/tissue types and disease conditions.

12/29/2020 4:56:12 PM +00:00

A balanced iterative random forest for gene selection from microarray data

The wealth of gene expression values being generated by high-throughput microarray technologies leads to complex, high-dimensional datasets. Moreover, many cohorts have the problem of imbalanced classes, where the number of patients belonging to each class is not the same.

12/29/2020 4:56:06 PM +00:00

A novel method to assess collagen architecture in skin

Texture within biological specimens may reveal critical insights, yet it is very difficult to quantify. This is a particular problem in histological analysis. For example, cross-polar images of picrosirius-stained skin reveal exquisite structure, allowing changes in the basketweave conformation of healthy collagen to be assessed.

12/29/2020 4:55:57 PM +00:00

MethyQA: A pipeline for bisulfite-treated methylation sequencing quality assessment

DNA methylation is an epigenetic event that adds a methyl group to the 5-carbon of cytosine. This epigenetic modification can significantly affect gene expression in both normal and diseased cells.

12/29/2020 4:55:51 PM +00:00

Differential expression analysis with global network adjustment

Large-scale chromosomal deletions or other non-specific perturbations of the transcriptome can alter the expression of hundreds or thousands of genes, and it is of biological interest to understand which genes are most profoundly affected.

12/29/2020 4:55:44 PM +00:00

The GMOseek matrix: A decision support tool for optimizing the detection of genetically modified plants

Since their first commercialization, the diversity of taxa and the genetic composition of transgene sequences in genetically modified plants (GMOs) have been constantly increasing. To date, the detection of GMOs and derived products is commonly performed by PCR-based methods targeting specific DNA sequences introduced into the host genome.

12/29/2020 4:55:38 PM +00:00

A multiple-alignment based primer design algorithm for genetically highly variable DNA targets

Primer design for highly variable DNA sequences is difficult, and experimental success requires attention to many interacting constraints. The advent of next-generation sequencing methods allows the investigation of rare variants otherwise hidden deep in large populations, but requires attention to population diversity and primer localization in relatively conserved regions, in addition to recognized constraints typically considered in primer design.

12/29/2020 4:55:31 PM +00:00

A flexible count data model to fit the wide diversity of expression profiles arising from extensively replicated RNA-seq experiments

High-throughput RNA sequencing (RNA-seq) offers unprecedented power to capture the real dynamics of gene expression. Experimental designs with extensive biological replication present a unique opportunity to exploit this feature and distinguish expression profiles with higher resolution.

12/29/2020 4:55:22 PM +00:00

Hierarchical Bayesian modelling of gene expression time series across irregularly sampled replicates and clusters

Time course data from microarrays and high-throughput sequencing experiments require simple, computationally efficient and powerful statistical models to extract meaningful biological signal, and for tasks such as data fusion and clustering.

12/29/2020 4:55:14 PM +00:00

GRank: A middleware search engine for ranking genes by relevance to given genes

Biologists may need to know the set of genes that are semantically related to a given set of genes. For instance, a biologist may need to know the set of genes related to another set of genes known to be involved in a specific disease. Some studies use the concept of gene clustering to identify semantically related genes.

12/29/2020 4:55:07 PM +00:00

Conversion of KEGG metabolic pathways to SBGN maps including automatic layout

Biologists make frequent use of databases containing large and complex biological networks. One popular database is the Kyoto Encyclopedia of Genes and Genomes (KEGG), which uses its own graphical representation and manual layout for pathways.

12/29/2020 4:54:58 PM +00:00

Gentrepid V2.0: A web server for candidate disease gene prediction

Candidate disease gene prediction is a rapidly developing area of bioinformatics research with the potential to deliver great benefits to human health. As experimental studies detecting associations between genetic intervals and disease proliferate, better bioinformatic techniques that can expand and exploit the data are required.

12/29/2020 4:54:52 PM +00:00

PKIS: Computational identification of protein kinases for experimentally discovered protein phosphorylation sites

Dynamic protein phosphorylation is an essential regulatory mechanism in various organisms. In this capacity, it is involved in a multitude of signal transduction pathways. Kinase-specific phosphorylation data lay the foundation for reconstruction of signal transduction networks.

12/29/2020 4:54:45 PM +00:00

ArkMAP: Integrating genomic maps across species and data sources

The visualisation of genetic and genomic maps aligned within and between species and across data sources can be used to inform studies of genome evolution, assist genome assembly projects and aid gene discovery and identification.

12/29/2020 4:54:38 PM +00:00

Group sparse canonical correlation analysis for genomic data integration

In this paper, we focus on a multivariate statistical method, canonical correlation analysis (CCA), for genomic data integration. The conventional CCA method does not work effectively if the number of data samples is significantly smaller than the number of biomarkers, which is typical for genomic data (e.g., SNPs).

12/29/2020 4:54:31 PM +00:00

RCircos: An R package for Circos 2D track plots

Circos is a Perl-based software package for visualizing similarities and differences of genome structure and positional relationships between genomic intervals. Running Circos requires extra data processing steps to prepare plot data files and configuration files from datasets, which limits its ability to integrate directly with other software tools such as R.

12/29/2020 4:54:22 PM +00:00

GOParGenPy: A high throughput method to generate Gene Ontology data matrices

Gene Ontology (GO) is a popular standard in the annotation of gene products and provides information related to genes across all species. The structure of GO is dynamic and is updated on a daily basis.

12/29/2020 4:54:16 PM +00:00

MatrixCatch - a novel tool for the recognition of composite regulatory elements in promoters

Accurate recognition of regulatory elements in promoters is an essential prerequisite for understanding the mechanisms of gene regulation at the level of transcription. Composite regulatory elements represent a particular type of such transcriptional regulatory elements consisting of pairs of individual DNA motifs.

12/29/2020 4:54:09 PM +00:00