🛠️Nextflow 24.04, 🧬MoLPC2: Prediction of Protein Complexes using AlphaFold2, 🧫 scRNA-seq with GeneExt, 🦠ViralFlow v1.0: Viral Genomic Surveillance
Bioinformer Weekly Roundup
Stay Updated with the Latest in Bioinformatics!
Issue: 39 | Date: 31 May 2024
👋 Welcome to the Bioinformer Weekly Roundup!
In this newsletter, we curate and bring you the most captivating stories, developments, and breakthroughs from the world of bioinformatics. Whether you're a seasoned researcher, a student, or simply curious about the intersection of biology and data science, we've got you covered. Subscribe now to stay ahead in the exciting realm of bioinformatics!
🔬 Featured Research
The study develops an RNA-sequencing approach to detect potentially pathogenic intronic variants often overlooked in genomic research. Applied to pancreatic cancer specimens, it links these variants to splicing dysregulation. This could enable the use of intronic mutations as prognostic markers and therapeutic targets in cancer.
Data is available here.
This paper discusses the use of single-cell RNA sequencing data for cross-species comparisons, highlighting its potential to reveal evolutionary links. It proposes integrating phylogenetic approaches into these analyses to overcome challenges and robustly test hypotheses about gene and cell evolution.
The study delves into the area of fungi-associated viruses using advanced sequencing technologies. It identifies 12 new RNA viral sequences, potentially representing new taxa. The research underscores the importance of these findings in understanding the diversity and evolution of fungal viruses and calls for further investigation into their interactions.
The study uses disease modules to explore potential drugs for Chronic Heart Failure (CHF) treatment and the role of Panax ginseng in ameliorating CHF. It suggests that drugs like dasatinib and midostaurin, and key components of Panax ginseng, could be promising for CHF treatment. The study highlights the effectiveness of disease module analysis in drug repurposing and understanding traditional Chinese medicine mechanisms.
The study aims to design vaccines and drugs for brucellosis using protein modeling, epitope prediction, and molecular docking. It identifies dominant B cell, cytotoxic T lymphocyte, and T helper lymphocyte epitopes in target proteins. Virtual screening reveals five potential compounds from the ZINC database that could inhibit Brucella’s proteins.
The study investigates the role of long noncoding RNA (lncRNA) in IgA nephropathy (IgAN). It identifies differentially expressed mRNAs and lncRNAs in IgAN samples, and constructs a regulatory network model of lncRNA-transcription factor-mRNA. The study suggests that key lncRNAs and transcription factors could provide insight into the pathogenesis of IgAN.
This study introduces a genomic analysis workflow to distinguish between two PIK3CA mutations in cancer. Using RNA-seq and ATAC-seq, it identifies unique transcriptomic and epigenomic differences and highlights AREG as a mutation-specific target. These findings suggest distinct genomic regulation between PIK3CA mutations, potentially leading to mutation-specific cancer treatment targets.
Data is available here.
🛠️ Latest Tools
Nextflow 24.04 introduces new features like Seqera Containers, Workflow Output Definition, Topic Channels, Process Eval Outputs, Resource Limits, and Job Arrays. Enhancements include Colored Logs, AWS Fargate Support, OCI Auto Pull Mode, and Support for GA4GH TES. You can view the full updates here.
The new release of SEDA, initially designed to aid life science researchers with DNA and protein sequence preparation, now includes a command-line interface and supports automated analysis pipelines. It also introduces gene annotation functionality and better Linux integration.
Source code is available here.
Dune, a new method for single-cell transcriptome sequencing (scRNA-Seq) datasets, optimizes the trade-off between cluster resolution and replicability. It iteratively merges clusters within partitions to maximize concordance, enhancing robustness and reducing reliance on tuning parameters.
Source code is available here.
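The merging idea behind Dune can be illustrated with a minimal Python sketch. It assumes the adjusted Rand index (ARI) as the concordance measure and a greedy search over cluster pairs. The function names (`adjusted_rand_index`, `merge_labels`, `greedy_merge`) and the toy labels are hypothetical and are not taken from the Dune source code.

```python
from collections import Counter
from itertools import combinations
from math import comb


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two equal-length label lists."""
    n = len(a)
    if n < 2:
        return 1.0
    # Pairs of cells that share a cluster in both, in a only, and in b only.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def merge_labels(labels, keep, drop):
    """Relabel every cell in cluster `drop` as cluster `keep`."""
    return [keep if x == drop else x for x in labels]


def greedy_merge(partitions):
    """Greedily merge cluster pairs while the mean pairwise ARI improves.

    `partitions` maps a partition name to a list of cluster labels
    (one label per cell); at least two partitions are required.
    """
    parts = {name: list(labels) for name, labels in partitions.items()}

    def mean_ari(ps):
        pairs = list(combinations(ps.values(), 2))
        return sum(adjusted_rand_index(x, y) for x, y in pairs) / len(pairs)

    current = mean_ari(parts)
    history = []
    while True:
        best = None
        # Try every pair of clusters inside every partition.
        for name, labels in parts.items():
            for c1, c2 in combinations(sorted(set(labels)), 2):
                trial = dict(parts)
                trial[name] = merge_labels(labels, c1, c2)
                score = mean_ari(trial)
                if best is None or score > best[0]:
                    best = (score, name, c1, c2)
        # Stop when no merge increases the concordance.
        if best is None or best[0] <= current:
            break
        current, name, c1, c2 = best
        parts[name] = merge_labels(parts[name], c1, c2)
        history.append((name, c1, c2, current))
    return parts, history


# Toy example: two clusterings of the same six cells.
toy = {"coarse": [0, 0, 0, 1, 1, 1], "fine": [0, 0, 0, 1, 1, 2]}
merged, history = greedy_merge(toy)
print(merged, history)
```

In this toy case the only merge that raises the ARI is joining clusters 1 and 2 of the "fine" partition, after which the two partitions agree and the loop stops. A real analysis would run on the tool itself rather than on this sketch.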
A machine learning pipeline has been developed to identify somatic mutations (SMs) using RNA-seq data, leading to the discovery of over 105,000 SMs that had not been reported in previous TCGA studies. These findings, combined with DNA-seq analyses, offer an updated mutational landscape across 32 cancer types in a new online SM atlas, OncoDB.
Source code is available here.
The study introduces an approach for cell type annotation in single-cell RNA sequencing data using unique marker gene sets. The method, which builds on the AUCell algorithm, demonstrates improved performance over existing reference-based tools in accuracy and efficiency, particularly in distinguishing closely related subtypes. A user-friendly application has also been developed to automate the cell typing process.
Data is available here.
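To give a feel for marker-based scoring, here is a simplified AUCell-style sketch in Python. It ranks the genes of one cell by expression, computes the area under the recovery curve of a marker set within the top-ranked genes, and assigns the cell type with the highest score. The function names, the `top_n` cutoff, the normalization, and the toy marker sets are assumptions for illustration, not the study's method or the AUCell package API.

```python
def recovery_auc(expression, gene_set, top_n):
    """Normalized area under the marker recovery curve for one cell.

    `expression` maps gene name to expression value. Only the `top_n`
    highest-expressed genes are considered (ties broken by gene name).
    """
    ranked = sorted(expression, key=lambda g: (-expression[g], g))[:top_n]
    hits = 0
    area = 0
    for gene in ranked:
        if gene in gene_set:
            hits += 1
        area += hits  # height of the recovery curve at this rank
    # Best case: all recoverable markers sit at the very top of the ranking.
    k = min(len(set(expression) & set(gene_set)), len(ranked))
    max_area = k * (k + 1) // 2 + (len(ranked) - k) * k
    return area / max_area if max_area else 0.0


def annotate_cell(expression, marker_sets, top_n):
    """Return the cell type whose marker set scores highest."""
    return max(
        marker_sets,
        key=lambda cell_type: recovery_auc(expression, marker_sets[cell_type], top_n),
    )


# Toy cell and hypothetical marker sets.
cell = {"CD3E": 9, "CD8A": 7, "ACTB": 5, "CD79A": 1, "MS4A1": 0}
markers = {"T cell": {"CD3E", "CD8A"}, "B cell": {"MS4A1", "CD79A"}}
print(annotate_cell(cell, markers, top_n=3))
```

Because the score depends only on gene ranks within a cell, it is insensitive to per-cell scaling of the expression values, which is one reason rank-based scores are popular for this task.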
ViralFlow v1.0, initially developed for viral genomic surveillance, has evolved into a general-purpose reference-based genome assembler for all viruses with an available reference genome. It includes new modules for studying nucleotide and amino acid mutations and operates on various computational infrastructures. It produces standard outputs for public health reporting and scientific problem-solving.
Source code is available here.
GeneExt is a tool that refines gene annotations using single-cell RNA sequencing data. It addresses the issue of incomplete gene models, particularly in non-model species, where inaccurate annotation of gene 3’ ends can lead to incorrect quantification or non-detection of many genes. The refined annotations allow for better cross-species comparisons of cell type expression atlases.
Source code is available here.
MoLPC2, an enhanced version of MoLPC, improves the prediction of large protein complex structures by simultaneously predicting their stoichiometry. It uses Monte Carlo Tree Search algorithms and allows sampling alternative AlphaFold predictions. This enables structure prediction of protein complexes without prior knowledge of stoichiometry, as demonstrated in 50 out of 175 non-redundant protein complexes.
Source code is available here.
📰 Community News
The Epstein-Barr virus (EBV), discovered 60 years ago, is the first virus proven to be linked to human cancer. It is carried by 90% of adults without symptoms. Researchers from the University of Basel and the University Hospital Basel have found that inhibiting a specific metabolic pathway in infected cells can reduce latent infection and lower the risk of diseases, including cancers and autoimmune disorders.
A new study from the Feinstein Institutes and King’s College London identifies gene expression patterns linked to schizophrenia, bipolar disorder, and depression, focusing on human endogenous retroviruses (HERVs). HERVs, which make up 8% of the genome, may regulate nearby genes and are linked to psychiatric conditions. These findings pave the way for future therapeutic research.
A consortium of researchers has developed comprehensive maps of gene regulation networks in the brains of people with and without mental disorders. The study, which used brain tissue from over 2,500 donors, advances our understanding of genetic risk factors for mental disorders and identifies potential targets for new therapeutics.
📅 Upcoming Events
The Nextflow Summit is a key event for scientific data processing and analysis, featuring talks, poster sessions, and networking. It highlights the latest developments in the Nextflow domain, with a focus on open science.
This virtual conference combines multi-omics and microbiome R&D, informatics, and technology in a single event. Speakers from industry, academia, and government will present their latest accomplishments and offerings.
This course offers a thorough grounding in genome-resolved metagenomics, encompassing data management, analysis, and interpretation using public resources. Participants engage in live lectures, Q&As, group activities, and computational exercises to gain practical experience. The course is designed for early-stage life scientists with prior bioinformatics exposure.
This course offers extensive training in single-cell RNA-seq data analysis using Python and command line tools. Participants learn droplet-based analysis pipelines, from raw reads to cell clusters, through practical exercises and live sessions.
📚 Educational Corner
The blog post discusses R's S3 and S4 objects in the context of object-oriented programming (OOP), offering a simplified explanation with metaphors and practical examples.
The author shares insights into lesser-known but impactful practices that have enhanced their journey, transcending typical programming skills. These practices have enabled them to effectively address challenges and derive meaningful insights from data.
🔗 Connect with Us
Stay connected and engage with us on social media for daily updates, discussions, and more!
📬 Subscribe
Don't miss an issue! Subscribe to the Bioinformer Weekly Roundup and receive the latest insights directly in your inbox.
We hope you enjoyed this week's edition of the Bioinformer Weekly Roundup. Feel free to share it with your colleagues and friends who share your passion for bioinformatics!
Disclaimer: The information provided in this newsletter is for educational and informational purposes only and does not constitute professional advice.
Contact: bioinformatics@zifornd.com
Copyright © 2024, Bioinformer Weekly Roundup. All rights reserved.