VCF analysis with VCF Observer 📊, scDist for Cell-Type Variability 🧬, MicroRNAs in Alzheimer's 🧠, EnzChemRED: Enzyme Chemistry Relation 🧪
Bioinformer Weekly Roundup
Stay Updated with the Latest in Bioinformatics!
Issue: 54 | Date: 13 September 2024
👋 Welcome to the Bioinformer Weekly Roundup!
In this newsletter, we curate and bring you the most captivating stories, developments, and breakthroughs from the world of bioinformatics. Whether you're a seasoned researcher, a student, or simply curious about the intersection of biology and data science, we've got you covered. Subscribe now to stay ahead in the exciting realm of bioinformatics!
🔬 Featured Research
The study introduces an information content approach to better quantify the clinical utility of multiplexed assays of variant effect (MAVE). By measuring information gain in bits for BRCA1, PTEN, and TP53 MAVEs, the method captures more comprehensive data than the traditional variant reclassification approach.
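To give a feel for what "information gain in bits" can mean, here is a minimal sketch that treats a variant's pathogenicity as a yes/no belief and measures how much Shannon entropy drops once assay evidence is taken into account. This is an illustration only, not the study's method: the function names and the example probabilities are hypothetical.

```python
from math import log2


def binary_entropy(p):
    """Shannon entropy, in bits, of a yes/no belief held with probability p."""
    if p in (0.0, 1.0):
        return 0.0  # a certain outcome carries no remaining uncertainty
    return -(p * log2(p) + (1 - p) * log2(1 - p))


def information_gain_bits(prior, posterior):
    """Reduction in uncertainty (bits) when a belief moves from prior to posterior."""
    return binary_entropy(prior) - binary_entropy(posterior)


# Hypothetical numbers: a variant starts at 50% probability of being
# pathogenic and moves to 90% after the assay result is considered.
gain = information_gain_bits(prior=0.5, posterior=0.9)
print(f"{gain:.3f} bits gained")
```

Under this simplified definition the gain is negative whenever the evidence leaves the belief more uncertain than before, which is one reason a real analysis may prefer a different formulation.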
This study focuses on the RTEL1 gene, a key player in telomere maintenance, and its associated mutations linked to various cancers and diseases like Dyskeratosis congenita (DC) and Hoyeraal–Hreidarsson syndrome (HHS). Through in silico analysis, 11 deleterious nsSNPs and 3 non-coding SNPs were identified that significantly impact RTEL1's structure and function.
This study investigated microbial growth in dust from the International Space Station (ISS) under varying moisture conditions, revealing significant fungal and bacterial growth when relative humidity exceeds 80-90%. Fungal diversity decreases as moisture increases, which may affect both health and spacecraft integrity.
This article examines the role of microRNAs in Alzheimer’s disease, focusing on their influence on β-amyloid accumulation, tau protein aggregation, mitochondrial dysfunction, and neuroinflammation. It highlights the potential of miRNAs as diagnostic markers and therapeutic targets in AD treatment.
This study evaluated PANoptosis-related genes (PRGs) in lung adenocarcinoma (LUAD) to assess their role in prognosis, tumour microenvironment, and treatment response. Through regression analyses, seven PRGs were identified, allowing LUAD patients to be classified into different survival risk groups.
This study aimed to develop a therapeutic mRNA vaccine to prevent CNS tuberculosis (TB) by targeting the PknD and Rv0986 epitopes of Mycobacterium tuberculosis using in silico methods. The designed vaccine is optimized for translation in humans, predicted to be non-allergenic and highly antigenic.
🛠️ Latest Tools
The article presents ST-GEARS, a 3D spatial transcriptomics recovery system aimed at addressing spatial distortions and improving tissue reconstruction accuracy. ST-GEARS uses Distributive Constraints and elastic fields to retrieve precise cross-sectional mappings. The system was evaluated on multiple datasets, showing structural consistency and reliable tissue regionalization.
Source code is available here.
A new web application has been developed for analysing and comparing VCF files in genomics. It offers a user-friendly interface for uploading, analysing, and visualizing VCF files through drag-and-drop and point-and-click operations. It includes essential visualizations like Venn diagrams and precision–recall plots and supports metadata-based file grouping for streamlined analysis.
Source code is available here.
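For readers unfamiliar with the precision and recall figures behind such plots, the sketch below compares two call sets as plain Python sets. It is not code from the tool: the variant tuples and function names are made up for illustration, and a real comparison would also need variant normalization and genotype handling.

```python
def variant_keys(records):
    """Reduce each record to a hashable (chrom, pos, ref, alt) key."""
    return {(chrom, pos, ref, alt) for chrom, pos, ref, alt in records}


def precision_recall(calls, truth):
    """Precision and recall of a call set against a truth set."""
    true_positives = len(calls & truth)
    precision = true_positives / len(calls) if calls else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical variants, not taken from any real VCF file.
truth = variant_keys([
    ("chr1", 100, "A", "G"),
    ("chr1", 200, "C", "T"),
    ("chr2", 50, "G", "A"),
    ("chr3", 75, "T", "C"),
])
calls = variant_keys([
    ("chr1", 100, "A", "G"),
    ("chr1", 200, "C", "T"),
    ("chr2", 50, "G", "A"),
    ("chr2", 90, "A", "T"),  # called, but absent from the truth set
])

precision, recall = precision_recall(calls, truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Repeating this calculation at several quality thresholds yields the points of a precision-recall curve.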
The article introduces scDist, a computational tool designed to detect cell-type-specific transcriptomic differences in single-cell RNA-seq data using a mixed-effects model. This tool reduces false positives arising from individual and cohort variability. It identifies immune cell perturbations in datasets related to COVID-19 and immunotherapy.
Source code is available here.
Cortexa is a web portal designed to study gene expression and alternative splicing in the murine brain. It provides valuable insights into the complex regulatory mechanisms at play in brain function and development, helping researchers understand the intricacies of neural processes.
Source code is available here.
HyperTraPS-CT is a tool for inferring and predicting accumulation pathways, in discrete or continuous time, using different model structures. It is designed to handle complex biological data, providing predictions for various biological processes and aiding in the understanding of disease progression and other biological phenomena.
Source code is available here.
This study proposes a simple feature selection model, called “Differentially Distributed Genes” (DDGs), which uses binomial models to uncover biological variation in droplet-based single-cell RNA sequencing. By enhancing the understanding of cellular heterogeneity, this approach improves the accuracy of single-cell analyses and helps identify significant biological differences across cell populations.
Source code is available here.
Bin2cell is a tool that reconstructs cells from high-resolution Visium HD data by combining morphology image segmentation with gene expression information. It enables detailed cellular analysis and spatial mapping, providing deeper insights into tissue architecture and cellular interactions, which are crucial for understanding complex biological systems and disease mechanisms.
Source code is available here.
SINUM is introduced as a method for constructing single-cell networks (SCNs) using mutual information to estimate gene dependencies from scRNA-seq data. This approach aims to capture the network architecture of each cell and explore cell-to-cell heterogeneity. Experiments on various datasets validated this tool in cell type identification.
Source code is available here.
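As background on the mutual information measure mentioned above, the following sketch computes it for two discretized expression vectors using only the standard library. It is a generic textbook estimator, not SINUM's implementation, and the toy on/off vectors are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(x, y):
    """Mutual information, in bits, between two equal-length discrete sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    n = len(x)
    count_x = Counter(x)
    count_y = Counter(y)
    count_xy = Counter(zip(x, y))
    mi = 0.0
    for (a, b), joint_count in count_xy.items():
        p_ab = joint_count / n
        # Compare the joint probability with what independence would predict.
        mi += p_ab * log2(p_ab / ((count_x[a] / n) * (count_y[b] / n)))
    return mi


# Hypothetical on/off expression states for three genes across four cells.
gene_a = [0, 0, 1, 1]
gene_b = [0, 0, 1, 1]  # always matches gene_a
gene_c = [0, 1, 0, 1]  # unrelated to gene_a

print(mutual_information(gene_a, gene_b))
print(mutual_information(gene_a, gene_c))
```

Higher values indicate a stronger dependency between two genes; a value of zero means the observed states are consistent with independence.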
SAAMBE-MEM is a sequence-based method for predicting binding free energy changes upon mutation in membrane protein-protein complexes. This tool aids in understanding the effects of mutations on protein interactions and stability, which is essential for drug design and understanding disease mechanisms.
Source code is available here.
EnzChemRED is a dataset designed to aid enzyme curation using NLP methods. It includes 1,210 PubMed abstracts with annotations for enzymes and their reactions. Fine-tuning language models with EnzChemRED improves identification and extraction of proteins, chemicals, and their conversions, supporting curation efforts in UniProtKB and Rhea.
Dataset is available here.
Metapipeline-DNA is a configurable pipeline designed to facilitate DNA sequencing data analysis. It includes processes like read alignment, variant calling, quality control, and subclonal reconstruction. The pipeline is robust to failures and offers configuration options, simplifying large-scale DNA sequencing analysis in clinical and research settings.
Source code is available here.
The study introduces lr-kallisto, a tool for fast and accurate quantification of long-read RNA sequencing data, enhanced by exome capture. This method adapts existing RNA-seq quantification techniques for long-read technologies, addressing challenges posed by isoform complexity and genetic variation.
Source code is available here.
A graph neural network framework is introduced to predict spatial gene expression from tissue histological images. This method offers a scalable alternative to costly spatial transcriptomics technologies. Experiments on breast cancer data show enhanced prediction performance and better delineation of spatial domains compared to current methods.
Source code is available here.
📰 Community News
Researchers found that a third of cells in human glioma can fire electrical impulses, challenging the belief that only neurons have this ability. These hybrid cells, which are part neuron and part glia, may have prognostic value, as patients with more of these cells showed improved survival. Similar cells were also found in non-tumor brains, indicating their importance in both glioma and normal brain function.
A study by Stockholm University scientists has used high-resolution genomic tools to map gene expression in Plasmodium falciparum, the deadliest malaria parasite, during its transition to gametocytes—essential for disease transmission via mosquitoes. The research identified key genetic regulators, particularly from the ApiAP2 transcription factor family, linked to male and female gametocyte development.
Researchers have discovered the “multiciliation cycle”, a cell cycle variant that helps airway cells grow cilia to clear bacteria and viruses. Unlike the normal cell cycle, these cells don’t divide or copy DNA but focus on cilia growth. The protein E2F7 is crucial for this process, and its absence can lead to issues like hydrocephalus. This finding enhances our understanding of cell development and disease mechanisms.
A new viewpoint in Genomic Psychiatry examines the interplay between genetic and environmental factors in schizophrenia risk. Researchers from Karolinska Institutet highlight findings from genome-wide association studies and epidemiological research, identifying genetic architecture and environmental risk factors like cannabis use and urban upbringing.
Researchers discovered that bacterial cells can “remember” temporary changes and pass these memories to offspring without altering DNA. Using Escherichia coli, they showed that temporary gene changes have lasting effects on the gene regulatory network, potentially bypassing antibiotic resistance. These findings are being tested with CRISPR technology.
📅 Upcoming Events
The Eurasia International Conference will be held in Minsk, Belarus, from October 1st to October 2nd, 2024. The conference provides a platform for professionals involved in Bioinformatics, Biomedicine, Biotechnology, and Computational Biology to exchange knowledge and gain insight into the state of the art in current technologies, techniques, and solutions in biotechnology.
This one-hour online workshop will be held on November 4, 2024, at 2:00 PM ET. The workshop will provide insights into the BinaryCIF format and its use in data analysis. BinaryCIF improves storage efficiency and speeds up parsing, making it a good fit for large-scale data analysis. This format is supported by key resources such as RCSB PDB, PDBe, and AlphaFold DB.
📚 Educational Corner
This is a short tutorial showing users how to convert the row names or row index of a pandas DataFrame into a column using the pandas reset_index() function.
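A minimal sketch of the idea, assuming pandas is installed; the gene names and expression values are made up for illustration and are not taken from the tutorial.

```python
import pandas as pd

# A small frame whose row labels live in a named index.
df = pd.DataFrame(
    {"expression": [5.2, 3.1, 8.7]},
    index=pd.Index(["BRCA1", "PTEN", "TP53"], name="gene"),
)

# reset_index() moves the index into a regular column and
# replaces it with a default 0..n-1 integer index.
flat = df.reset_index()

# When the index has no name, the new column is called "index";
# rename it afterwards if a clearer label is wanted.
unnamed = pd.DataFrame({"expression": [5.2, 3.1]}, index=["BRCA1", "PTEN"])
renamed = unnamed.reset_index().rename(columns={"index": "gene"})

print(flat)
print(renamed)
```

Passing drop=True to reset_index() discards the old index instead of keeping it as a column.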
This tutorial explains how to get the PID (process identifier) of a running process using Python. It covers different Python methods that users can utilize to obtain PID for the current, parent and child processes, and provides links for further reading.
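The three cases can be sketched with the standard library alone. This is one possible approach and not necessarily the one the tutorial uses; the child here is simply a second Python interpreter started for the demonstration.

```python
import os
import subprocess
import sys

# PID of the interpreter running this script.
current_pid = os.getpid()

# PID of the process that launched it (a shell, an IDE, a scheduler, ...).
parent_pid = os.getppid()

# Start a child process that prints its own PID, then read it back.
child = subprocess.Popen(
    [sys.executable, "-c", "import os; print(os.getpid())"],
    stdout=subprocess.PIPE,
    text=True,
)
reported_by_child, _ = child.communicate()

# Popen exposes the PID of the process it started as the .pid attribute.
child_pid = child.pid

print("current:", current_pid)
print("parent: ", parent_pid)
print("child:  ", child_pid, "(child reported", reported_by_child.strip() + ")")
```

On some Windows setups the interpreter is started through a launcher, so the PID the child reports may differ from child.pid.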
This exercise is part of a course hosted by the Costa laboratory at the Institute for Computational Genomics, RWTH Aachen. It demonstrates how to obtain clinical and genomic data from The Cancer Genome Atlas (TCGA) and perform classical analyses of clinical data using R. The exercise includes steps for downloading and processing data, conducting unsupervised analysis, and performing survival analysis, along with hands-on practice.
The repository is part of a two-day course on virtualization with containers. Day one focuses on using Docker to customize, store, manage, and share containerized environments. Day two covers the basics of the Snakemake workflow management system, guiding learners through creating a computational workflow using containers and a package manager.
🔗 Connect with Us
Stay connected and engage with us on social media for daily updates, discussions, and more!
📬 Subscribe
Don't miss an issue! Subscribe to the Bioinformer Weekly Roundup and receive the latest insights directly in your inbox.
We hope you enjoyed this week's edition of the Bioinformer Weekly Roundup. Feel free to share it with your colleagues and friends who share your passion for bioinformatics!
Disclaimer: The information provided in this newsletter is for educational and informational purposes only and does not constitute professional advice.
Contact: bioinformatics@zifornd.com
Copyright © 2024, Bioinformer Weekly Roundup. All rights reserved.