Computational Biology Branch
NLM Intramural Research Program
NCBI Computational Biology Branch
Research in the NCBI Computational Biology Branch (CBB) focuses on theoretical, analytical, and applied computational approaches to a broad range of fundamental problems in molecular biology and medicine.
Research Overview
The research program in the Computational Biology Branch is carried out by Senior Investigators, tenure track Investigators, Staff Scientists, Postdoctoral Fellows, and students. The program focuses on theoretical, analytical and applied approaches to a broad range of fundamental problems in molecular biology.
The expertise of the group is concentrated in sequence analysis, protein structure/function analysis, chemical informatics, and genome analysis. Research interests further cover a wide range of topics in computational biology and information science. These include, but are not limited to, database searching algorithms, sequence signal identification, mathematical models of evolution, statistical methods in virology, dynamic behavior of chemical reaction systems, statistical text-retrieval algorithms, protein structure and function prediction, comparative genomics, taxonomic trees, population genetics, and systems biology.
Many of the basic research projects conducted by CBB investigators serve to enhance and strengthen NCBI's suite of publicly available databases and software application tools. Collaborative research efforts, among NCBI investigators as well as with the external research community, have led to the development of innovative algorithms (BLAST, PSI-BLAST, VAST, and COGs), novel research approaches (text neighboring) and fundamental resources (PubChem and CDD) that have transformed the field of computational biology. Algorithms and applications currently under development have the potential to further advance scientific discovery.
Members of the CBB contribute significantly to the validity and reliability of NCBI's online resources by reviewing the quality and accuracy of the data deposited in the databases, as well as the accuracy of the information used to annotate the data. Members also provide leadership and guidance to the extramural community by planning and organizing scientific consortia to determine the most effective use of public sequence resources for large-scale or high-throughput experimental biology. Researchers collaborate to define new areas of research and identify appropriate computational mechanisms to address them.
Tools and Topics
- Analysis of Complete Genomes
- Clusters of Orthologous Groups (COGs)
- Genetics Analysis Software
- HistoneDB2.0 with variants
- LogOddsLogo
- SNPDelScore
Last updated: 2021-12-13T22:05:31Z
Computational Biology
As the field of biology has become more diverse and complex, so the field of computational biology has grown to support it. At the same time, as computational power and programming have become more sophisticated, computational biologists have stepped in as motivated and capable partners in the quest to understand disease. Today, computational biologists in the Intramural Research Program (IRP) take many different approaches to answer theoretical and experimental biological questions across a range of disciplines, including:
- Image Analysis: High-resolution optical imaging is a key to much of our biomedical research. Computers supply the advanced imaging methods and algorithms that allow us to view the human body from macro to nano.
- Biomodelling or Systems Biology: Computational biomodelling, or systems biology, is a computer-based simulation of a biological system used to understand and predict interactions within that system. Computers can model systems at any level, from populations to cellular networks and the sub-cellular worlds of signal transduction pathways and gene regulatory networks.
- Neuroscience: Computers are often compared to the brain, in terms of their ability to process information. So it’s no surprise that scientists use computers to further understand how this processing occurs.
- Bioinformatics: Biomedical science has experienced a recent increase in “-omics” research (genomics, proteomics, metabolomics, etc.) and as a result has embraced computational methods designed to simplify the analysis of the enormous amounts of data associated with this type of research.
IRP Institutes and Centers have embraced computational biology, with many now having a dedicated computing core.
IRP programs provide excellent opportunities for career training and development, from postbaccalaureate to postdoctoral fellowships. Learn more about these opportunities via the NIH Office of Intramural Training and Education.
This page was last updated on Wednesday, March 8, 2023
Center for Computational Biology
Computational Biology PhD
The main objective of the Computational Biology PhD is to train the next generation of scientists who are both passionate about exploring the interface of computation and biology, and committed to functioning at a high level in both computational and biological fields.
The program emphasizes multidisciplinary competency, interdisciplinary collaboration, and transdisciplinary research, and offers an integrated and customizable curriculum that consists of two semesters of didactic course work tailored to each student’s background and interests, research rotations with faculty mentors spanning computational biology’s core disciplines, and dissertation research jointly supervised by computational and biological faculty mentors.
The Computational Biology Graduate Group facilitates student immersion into UC Berkeley’s vibrant computational biology research community. Currently, the Group includes over 46 faculty from across 14 departments of the College of Letters and Science, the College of Engineering, the College of Natural Resources, and the School of Public Health. Many of these faculty are available as potential dissertation research advisors for Computational Biology PhD students, with more available for participation on doctoral committees.
The First Year
The time to degree (normative time) of the Computational Biology PhD is five years. The first year of the program emphasizes gaining competency in computational biology, the biological sciences, and the computational sciences (broadly construed). Since student backgrounds will vary widely, each student will work with faculty and student advisory committees to develop a program of study tailored to their background and interests. Specifically, all first-year students must:
- Perform three rotations with Core faculty (one rotation with a non-Core faculty member is acceptable with advance approval)
- Complete course work requirements (see below)
- Complete a course in the Responsible Conduct of Research
- Attend the computational biology seminar series
- Complete experimental training (see below)
Laboratory Rotations
Entering students are required to complete three laboratory rotations during their first year in the program to seek out a Dissertation Advisor under whose supervision dissertation research will be conducted. Students should rotate with at least one computational Core faculty member and one experimental Core faculty member. Click here to view rotation policy.
Course Work & Additional Requirements
Students must complete the following coursework in the first three (up to four) semesters. Courses must be taken for a grade and a grade of B or higher is required for a course to count towards degree progress:
- Fall and Spring semester of CMPBIO 293, Doctoral Seminar in Computational Biology
- A Responsible Conduct of Research course, most likely through the Department of Molecular and Cell Biology.
- STAT 201A & STAT 201B: Intro to Probability and Statistics at an Advanced Level. Note: Students who are offered admission and are not prepared to complete STAT 201A and 201B will be required to complete STAT 134 or PH 142 first.
- CS61A: The Structure and Interpretation of Computer Programs. Note: students with the equivalent background can replace this requirement with a more advanced CS course of their choosing.
- 3 elective courses relevant to the field of Computational Biology, one of which must be at the graduate level (see below for details).
- Attend the computational biology invited speaker seminar series. A schedule is circulated to all students by email and is available on the Center website. Starting with the 2023 entering class, CCB PhD students must enroll in CMPBIO 275: Computational Biology Seminar, which provides credit for this seminar series.
- Experimental training, which can be met through:
- 1) completion of a laboratory course at Berkeley with a minimum grade of B,
- 2) completion of a rotation in an experimental lab (with an experimental project), with a positive evaluation from the PI,
- a biological sciences undergraduate major with at least two upper division laboratory-based courses,
- a semester or equivalent of supervised undergraduate experimental laboratory-based research at a university,
- or previous paid or volunteer/internship work in an industry-based experimental laboratory.
Students are expected to develop a course plan for their program requirements and to consult with the Head Graduate Advisor before the Spring semester of their first year for formal approval (signature required). The course plan will take into account the student’s undergraduate training areas and goals for PhD research areas.
Satisfactory completion of first year requirements will be evaluated at the end of the spring semester of the first year. If requirements are satisfied, students will formally choose a Dissertation advisor from among the core faculty with whom they rotated and begin dissertation research.
Waivers: Students may request waivers for the specific courses STAT 201A, STAT 201B, and CS61A. In all cases of waivers, the student must take alternative courses in related areas so as to complete six courses in total, as described above. To waive STAT 201A/B, students can demonstrate they have completed the equivalent by passing a proctored assessment exam on campus. To waive CS61A, the Head Graduate Advisor will evaluate the student’s previous coursework, based on the previous course’s syllabus and other course materials, to determine equivalency.
Electives: Of the three electives, students are required to choose one course in each of the two following cluster areas:
- Cluster A (Biological Science) : These courses are defined as those for which the learning goals are primarily related to biology. This includes courses covering topics in molecular biology, genetics, evolution, environmental science, experimental methods, and human health. This category may also cover courses whose focus is on learning how to use bioinformatic tools to understand experimental data.
- Cluster B (Computational Sciences): These courses are defined as those for which the learning goals involve computing, inference, or mathematical modeling, broadly defined. This includes courses on algorithms, computing languages or structures, mathematical or probabilistic concepts, and statistics. This category would include courses whose focus is on biological applications of such topics.
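As an illustration only, the elective rule described above can be written as a small check: at least three electives, at least one from each cluster, and at least one at the graduate level (reading "one of which" as "at least one" is an assumption). The `Elective` record, the course labels, and the function name are hypothetical and do not correspond to any Berkeley system; in practice the Head Graduate Advisor decides which cluster a course counts toward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Elective:
    """Hypothetical record for one elective course."""
    name: str             # illustrative label, not a real course number
    cluster: str          # "A" (biological science) or "B" (computational sciences)
    graduate_level: bool  # True if the course is offered at the graduate level


def electives_satisfy_rule(electives):
    """Return True if the plan has at least three electives, covers both
    clusters, and includes at least one graduate-level course."""
    if len(electives) < 3:
        return False
    clusters = {e.cluster for e in electives}
    has_graduate = any(e.graduate_level for e in electives)
    return {"A", "B"} <= clusters and has_graduate


# Example plan with made-up course labels.
plan = [
    Elective("Biology elective", "A", False),
    Elective("Statistics elective", "B", True),
    Elective("Second computational elective", "B", False),
]
only_cluster_b = [Elective(f"Computational elective {i}", "B", True) for i in range(3)]

print(electives_satisfy_rule(plan))            # both clusters covered, one graduate course
print(electives_satisfy_rule(only_cluster_b))  # Cluster A is missing
```

A plan that fails this check is not necessarily invalid: courses that overlap both clusters are assigned by the Head Graduate Advisor, which a simple lookup like this cannot capture.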
The link below lists some relevant courses, but students can take courses beyond this list; for courses not on this list, the Head Graduate Advisor will determine to which cluster a course can be credited. For classes that have significant overlap between these two clusters, the department which offers the course may influence the decision of the HGA as to whether the course should be assigned to cluster A or B.
See below for some suggested courses in these categories:
Suggested Coursework Options
Second Year & Beyond
At the beginning of the fall of the second year, students begin full-time dissertation research in earnest under the supervision of their Dissertation advisor. It is anticipated that it will take students three (up to four) semesters to complete the six-course requirement. Students are required to continue to participate annually in the computational biology seminar series.
Qualifying Examination
Students are expected to take and pass an oral Qualifying Examination (QE) by the end of the spring semester (June 15th) of their second year of graduate study. Students must present a written dissertation proposal to the QE committee no fewer than four weeks prior to the oral QE. The write-up should follow the format of an NIH-style grant proposal (i.e., it should include an abstract, background and significance, specific aims to be addressed (~3), and a research plan for addressing the aims) and must thoroughly discuss plans for research to be conducted in the dissertation lab. Click here for more details on the guidelines and format for the QE. Click here to view the rules for the composition of the committee and the form for declaring your committee.
Advancement to Candidacy
After successfully completing the QE, students will Advance to Candidacy. At this time, students select the members of their dissertation committee and submit this committee for approval to the Graduate Division. Students should endeavor to include a member whose research represents a complementary yet distinct area from that of the dissertation advisor (i.e., biological vs. computational, experimental vs. theoretical) and that will be integrated in the student’s dissertation research. Click here to view the rules for the composition of the committee and the form for declaring your committee.
Meetings with the Dissertation Committee
After Advancing to Candidacy, students are expected to meet with their Dissertation Committee at least once each year.
Teaching Requirements
Computational Biology PhD students are required to teach at least two semesters (starting with the Fall 2019 class), but may teach more. The requirement can be modified if the student has funding that does not allow teaching. For the Fall 2019 class onward, at least one of those courses should require that you teach a section. Berkeley Connect or CMPBIO 293 can count towards one of the required semesters.
The Dissertation
Dissertation projects will represent scholarly, independent and novel research that contributes new knowledge to Computational Biology by integrating knowledge and methodologies from both the biological and computational sciences. Students must submit their dissertation by the May Graduate Division filing deadline (see Graduate Division for date) of their fifth (and final) year.
Special Requirements
Students will be required to present their research either orally or via a poster at the annual retreat beginning in their second year.
Financial Support
The Computational Biology Graduate Group provides a competitive stipend as well as full payment of fees and non-resident tuition (which includes health care). Students maintaining satisfactory academic progress are provided full funding for five to five and a half years. The program supports students in the first year, while the PI/mentor provides support from the second year on. A portion of this support is in the form of salary from teaching assistance as a Graduate Student Instructor (GSI) in allied departments, such as Molecular and Cell Biology, Integrative Biology, Plant and Microbial Biology, Mathematics, Statistics or Computer Science. Teaching is part of the training of the program and most students will not teach more than two semesters, unless by choice.
Due to cost constraints, the program admits few international students; the average is two per year. Those admitted are also given full financial support (as noted above): stipend, fees and tuition.
Students are also strongly encouraged to apply for extramural fellowships for the proposal-writing experience. There are a number of extramural fellowships that Berkeley students apply for that current applicants may find appealing. Please note that the NSF now allows only two submissions: one as an undergraduate and one in graduate school. The NSF funds students with potential, as opposed to specific research projects, so do not be concerned that you don’t know your grad school plans yet; just put together a good proposal! Although we make admissions offers before the fellowship results are released, all eligible students should take advantage of both opportunities to apply, as it’s a great opportunity and a great addition to a CV.
- National Science Foundation Graduate Research Fellowship (app deadlines in Oct)
- Hertz Foundation Fellowship (app deadline Oct)
- National Defense Science and Engineering Graduate Fellowship (app deadline in mid-Fall)
- DOE Computational Science Graduate Fellowship (Krell Institute) (app deadline in Jan)
CCB no longer requires the GRE for admission (neither general, nor subject). The GRE will not be seen by the review committee, even if sent to Berkeley.
PLEASE NOTE: The application deadline is Monday, December 2, 2024, 8:59 pm PST/11:59 pm EST
We invite applications from students with distinguished academic records, strong foundations in the basic biological, physical and computational sciences, as well as significant computer programming and research experience. Admission for the Computational Biology PhD is for the fall semester only, and Computational Biology does not offer a Master’s degree.
We are happy to answer any questions you may have, but please be sure to read this entire page first, as many of your questions will be answered below or on the Tips tab.
IMPORTANT: Please note that it is not possible to select a specific PhD advisor until the end of the first year in the program, so contacting individual faculty about openings in their laboratories will not increase your chances of being accepted into the program. You will have an opportunity to discuss your interests with relevant faculty if you are invited to interview in February.
Undergraduate Preparation
Minimum requirements for admission to graduate study:
- A bachelor’s degree or recognized equivalent from an accredited institution.
- Minimum GPA of 3.0.
- Undergraduate preparation reflecting a balance of training in computational biology’s core disciplines (biology, computer science, statistics/mathematics), for example, a single interdisciplinary major, such as computational biology or bioinformatics; a major in a core discipline and a combination of interdisciplinary course work and research experiences; or a double major in core disciplines.
- Basic research experience and aptitude are key considerations for admission, so evidence of research experience and letters of recommendation from faculty mentors attesting to the applicant’s research experience are of particular interest.
- GRE: NOT required or used for review.
- TOEFL scores for international students (see below for details).
Application Requirements
ALL materials, including letters, are due December 2, 2024 (8:59 pm PST). More information is provided and required as part of the online application, so please create an account and review the application before emailing with questions (and please set up an account well before the deadline):
- A completed graduate application: The online application opens in early or mid-September and is located on the Graduate Division website. Paper applications are not accepted. Please create your account and review the application well ahead of the submit date, as it will take time to complete and requests information not listed here.
- A nonrefundable application fee: The fee must be paid using a major credit card and is not refundable. For US citizens and permanent residents, the fee is $135; US citizens and permanent residents may request a fee waiver as part of the online application. For all other students (international) the fee is $155 (no waivers, no exceptions). Graduate Admissions manages the fee, not the program, so please contact them with questions.
- Three letters of recommendation, minimum (up to five are accepted): Letters of recommendation must be submitted online as part of the Graduate Division’s application process. Letters are also due Dec. 2, so please inform your recommenders of this deadline and give them sufficient advance notice. It is your responsibility to monitor the status of your letters of recommendation (sending prompts, as necessary) in the online system.
- Transcripts: Unofficial copies of all relevant transcripts, uploaded as part of the online application (see application for details). Scanned copies of official transcripts are strongly preferred, as transcripts must include applicant and institution name and degree goal and should be easy for the reviewers to read (print-outs from online personal schedules can be hard to read, and transcripts without your name and the institution name cannot be used for review). Do not mail official transcripts to the Graduate Division or Computational Biology; they will be discarded.
- Essays: Follow links to view descriptions of what these essays should include (Statement of Purpose [2-3 pages], Personal Statement [1-2 pages]). Also review the Tips tab for formatting advice.
- (Highly recommended) Applicants should consider applying for extramural funding, such as NSF Fellowships. These are amazing opportunities and the application processes are great preparation for graduate studies. Please see Financial Support tab.
- Read and follow all of the “Application Tips” listed on the last tab. This ensures that everything goes smoothly and you make a good impression on the faculty reviewing your file.
The GRE general test is not required. GRE subject tests are not required. GRE scores will not be a determining factor for application review and admission, and will NOT be seen by the CCB admissions committee. While we do not encourage anyone to take the exam, in case you decide to apply to a different program at Berkeley that does require them: the UC Berkeley school code is 4833; department codes are unnecessary. As long as the scores are sent to UC Berkeley, they will be received by any program you apply to on campus.
TOEFL/IELTS
Adequate proficiency in English must be demonstrated by those applicants applying from countries where English is not the official language. There are two standardized tests you may take: the Test of English as a Foreign Language (TOEFL), and the International English Language Testing System (IELTS). TOEFL minimum passing scores are 90 for the Internet-based test (IBT), and 570 for the paper-based format (PBT). The TOEFL may be waived if an international student has completed at least one year of full-time academic course work with grades of B or better while in residence at a U.S. university (transcript will be required). Please click here for more information.
Application Deadlines
The Application Deadline is 8:59 pm Pacific Standard Time, December 2, 2024. The application will lock at 9 pm PST, precisely. All materials must be received by the deadline. While recommendation letters can continue to be submitted and received after the deadline, the committee meets in early December and will review incomplete applications. TOEFL tests should be taken by or before the deadline, but self-reported scores are acceptable for review while the official scores are being processed. All submitted applications will be reviewed, even if materials are missing, but missing materials may affect the evaluation of the application.
It is your responsibility to ensure and verify that your application materials are submitted in a timely manner. Please be sure to hit the submit button when you have completed the application and to monitor the status of your letters of recommendation (sending prompts, as necessary). Please include the statement of purpose and personal statement in the online application. While you can upload a CV, please DO NOT upload entire publications or papers. Please DO NOT send paper résumés, separate folders of information, or articles via mail. They will be discarded unread.
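For applicants outside the Pacific time zone, the stated deadline of 8:59 pm PST on December 2, 2024 can be converted with a few lines of Python. This is only a sketch: it assumes fixed standard-time offsets (PST = UTC-8, EST = UTC-5), which hold in December, and the variable names are illustrative.

```python
from datetime import datetime, timedelta, timezone

# Fixed standard-time offsets (an assumption that is valid in December,
# when daylight saving time is not in effect).
PST = timezone(timedelta(hours=-8), "PST")
EST = timezone(timedelta(hours=-5), "EST")

# Deadline as stated on the page: 8:59 pm PST, December 2, 2024.
deadline_pst = datetime(2024, 12, 2, 20, 59, tzinfo=PST)

# The same instant expressed in Eastern time and in UTC.
deadline_est = deadline_pst.astimezone(EST)
deadline_utc = deadline_pst.astimezone(timezone.utc)

print(deadline_est.strftime("%Y-%m-%d %H:%M %Z"))  # 2024-12-02 23:59 EST
print(deadline_utc.strftime("%Y-%m-%d %H:%M %Z"))  # 2024-12-03 04:59 UTC
```

Note that the deadline falls on December 3 in UTC and in every time zone east of it, so applicants abroad should convert from the Pacific time rather than rely on the calendar date alone.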
The Computational Biology Interview Visit dates are yet to be determined, but will be posted here once they are.
Top applicants who are being considered for admission will be invited to visit campus for interviews with faculty. Invitations will be made by early January. Students are expected to stay for the entire event, arriving in Berkeley by 5:30 pm on the first day and leaving the evening of the final day. In the application, you must provide the names of between 7 and 10 faculty from the Computational Biology website with whom you are interested in conducting research or performing rotations. This helps route your application to our reviewers and facilitates the interview scheduling process. An invitation is not a guarantee of admission.
International students may be interviewed virtually, as flights are often prohibitively expensive.
Tips for the Application Process
Uploaded Documents: Be sure to put your name and type of essay on your essays (Statement of Purpose [2-3 pages], Personal Statement [1-2 pages]) as a header or before the text, whether you use the text box or upload a PDF or Word doc. There is no minimum length on either essay, but 3 pages maximum is suggested. The Statement of Purpose should describe your research and educational background and aspirations. The Personal Statement can include personal achievements not necessarily related to research, barriers you’ve had to overcome, mentoring and volunteering activities, and things that make you unique and demonstrate the qualities you will bring to the program.
Letters of Recommendation: These should be from persons who have supervised your research or academic work and who can evaluate your intellectual ability, creativity, leadership potential and promise for productive scholarship. If lab supervision was provided by a postdoc or graduate student, the letter should carry the signature or support of the faculty member in charge of the research project. Note: the application can be submitted before all of the recommenders have completed their letters. It is your responsibility to keep track of your recommenders’ progress through the online system. Be sure to send reminders if your recommenders do not submit their letters.
Extramural fellowships: It is to your benefit to apply for fellowships, as they may facilitate entry into the lab of your choice, are a great addition to your CV and often provide higher stipends. Do not let concerns about coming up with a research proposal before joining a lab prevent you from applying. The fellowships are looking for research potential and proposal writing skills and will not hold you to specific research projects once you have started graduate school.
Calculating GPA: Schools can differ in how they assign grades and calculate grade point averages, so it may be difficult for this office to offer advice. The best resource for calculating the GPA for your school is to check the back of the official transcripts where a guide is often provided or use an online tool. There are free online GPA conversion tools that can be found via an internet search.
Faculty Contact/Interests: Please be sure to list faculty that interest you as part of the online application. You are not required to contact any faculty in advance, nor will doing so assist with admission, but you are welcome to if you wish to learn more about their research.
Submitting the application: To avoid the possibility of computer problems on either side, it is NOT advisable to wait until the last day to start and/or submit your application. It is not unusual for the application system to have difficulties during times of heavy traffic. However, there is no need to submit the application too early. No application will be reviewed before the deadline.
Visits: We only arrange one campus visit for recruitment purposes. If you are interested in visiting the campus and meeting with faculty before the application deadline, you are welcome to do so on your own time (we will be unable to assist).
Name: Please double check that you have entered your first and last names in the correct fields. This is our first impression of you as a candidate, so you do want to get your name correct! Be sure to put your name on any documents that you upload (Statement of Purpose, Personal Statement).
California Residency: You are not considered a resident if you hope to enter our program in the Fall, but have never lived in California before or are here on a visa. So, please do not mark “resident” on the application in anticipation of admission. You must have lived in California previously, and be a US citizen or Permanent Resident, to be a resident.
Faculty Leadership
- Head Graduate Advisor and Chair for the PhD & DE: John Huelsenbeck
- Associate Head Graduate Advisor for PhD & DE: Liana Lareau
- Equity Advisor: Rasmus Nielsen
- Director of CCB: Elizabeth Purdom
Core PhD & DE Faculty (link)
Staff support
- Student Services Advisor (GSAO): Kate Chase
Computational Systems Biology
With the advances of high-throughput experimental techniques, biomedical research is turning into information science. This requires the use of machine and deep-learning approaches, statistics and mathematical modelling. Individual cellular processes that comprise the interplay of several molecular players, such as cell signaling, can now be quantitatively characterized to allow a systematic view of biological processes. A better understanding of biological processes is crucial in order to provide robust predictive models that improve disease prognoses and treatment strategies. Our group is exploiting a large variety of data — multi-omics datasets, single-cell proteomics and mass spectrometry-based quantitative proteomics — to dissect the molecular mechanisms of cancer. Our goal is to develop predictive models for precision medicine.
- Omics: RNAseq, CNV, SNP, miRNA, SWATH–MS
- Single cell: Mass cytometry
- Clinical data: Survival outcome, treatment
- Literature: Publications
- Compound structure: SMILES, graph representation of molecules
- Networks: protein–protein interactions, pathways
- Machine learning: Deep learning, dimensionality reduction, clustering, classification, generative models
- Statistical inference: Probabilistic models, network inference
- Mathematical modeling: Stochastic hybrid models, Boolean networks
Research goals
Tumor heterogeneity
- Leading drivers of cancer
- Molecular mechanisms
Personalized medicine
- Patient stratification
- Early diagnosis
- Targeted treatment
At IBM Research in Zurich, we develop novel approaches to analyze different molecular levels of high-throughput data. From single-cell to cell population-averaged data (proteomics, transcriptomics), we aim to integrate multiple layers of genome-scale information. This, in combination with clinical information and prior knowledge through literature mining, enables us to understand molecular mechanisms and explore applications to personalised medicine.
Our main research projects include, but are not limited to, studying cell-to-cell heterogeneity, integrative multi-omics analysis, dynamic network inference and robust biomarker discovery, most of which are applied in the case of cancer. Recently, we focused on anticancer drug modelling, specifically on leveraging biomarker information into generative models for de-novo drug design, attempting to bridge systems biology and anticancer drug discovery.
We gratefully acknowledge our numerous collaborations with university hospitals, research institutes and universities that work alongside our team in many of our projects.
Research topics
Interpretability for machine learning and computational biology.
Understanding real-world datasets (e.g. electronic health records or omics data) is often challenging because of their size and complexity, or because the problem to be tackled is poorly understood.
To achieve high accuracy on important tasks, equally complex machine-learning and deep-learning models are usually used. In many situations, the decisions made by such automated systems can have significant, and potentially deleterious, consequences.
In biology and healthcare, interpretability becomes important for three main reasons.
For example, doctors and patients need to be confident about the decision reached by a deployed model. Providing the rationale behind a decision could make a model more trustworthy.
2. Debugging
A model could return unexpected predictions, possibly indicating poor performance. Interpretability could help by shedding light on the causes behind poor performance, such as unfair dataset bias or poor model training.
3. Generating biological hypotheses
Surprising results do not always have a negative connotation. Rather, they might be due to the trained model leveraging a true pattern in the data that is unknown even to field experts, such as an unknown protein–protein interaction. Interpretable methods can potentially uncover these patterns, which can then be used as the basis for novel biological hypotheses.
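One common model-agnostic way to probe what a trained model relies on is permutation feature importance: shuffle one feature at a time and measure how much accuracy drops. The sketch below is a minimal, standard-library illustration of that general technique, not the method used by the group; `toy_model`, the synthetic data, and all function names are assumptions made for the example.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)


def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the labels
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


def toy_model(row):
    """Hypothetical 'trained' classifier that only looks at feature 0."""
    return int(row[0] > 0.5)


# Synthetic data: the label depends on feature 0; feature 1 is pure noise.
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [int(row[0] > 0.5) for row in X]

scores = permutation_importance(toy_model, X, y)
print(scores)
```

Here the importance of feature 1 is exactly zero because the toy model never reads it, while shuffling feature 0 clearly degrades accuracy. On real data, an unexpectedly important feature may point either to a dataset artifact worth debugging or to a pattern worth following up as a hypothesis.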
Tumor cells exhibit a high degree of variability in terms of morphology, phenotype, metastatic potential and underlying molecular profile. This heterogeneity is present not only across different patients (inter-tumor heterogeneity) but also within the same tumor (intra-tumor heterogeneity) and has emerged as an inherent property of cancer.
Identifying the sources of heterogeneity and its implications in clinical outcomes, such as response to therapy or ability to metastasize, has become a cornerstone for the development of effective disease management strategies.
Read more about modeling spatial heterogeneity of the tumor microenvironment.
Read more about quantifying biological heterogeneity from single-cell data.
Multimodal data integration
Developing a predictive computational technology to exploit and integrate multiple molecular and clinical data.
Read more about multimodal data integration.
Research assets
Anticancer drug modelling for precision medicine.
Automatic text mining and analysis.
Pathway-induced multiple kernel learning.
CellCycleTRACER
A novel computational method to quantify cell cycle and cell volume variability.
Estimating the frequency of genetic alterations.
Consensus inference of molecular networks.
Technical resources
- PIMKL / MIMKL
- PaccMann / PaccMann RL
We gratefully acknowledge generous funding from SystemsX.ch, SNF and the European Union.
Publications
- Matteo Manica
- Ali Oskooei
- Molecular Pharmaceutics
- Scientific Reports
- Johanna Wagner
- Maria Anna Rapsomaniki
- Marcel Jan Thomas
- Frontiers in Immunology
- Roland Mathis
- Nature Machine Intelligence
- Joris Cadow
- npj Systems Biology and Applications
- Xiao Kang Lun
- Nature Communications
- Manuel Le Gallo
- Abu Sebastian
- Nature Electronics
Novel AI tools to accelerate cancer research
Deciphering breast cancer heterogeneity using machine learning.
gene expression and regulation • DNA, RNA, and protein sequence, structure, and interactions • molecular evolution • protein design • network and systems biology • cell and tissue form and function • disease gene mapping • machine learning • quantitative and analytical modeling
David Bartel
Christopher Burge, Olivia Corradin, Amy E. Keating, Eric S. Lander, Douglas Lauffenburger, Gene-Wei Li, Adam C. Martin, Sergey Ovchinnikov, David C. Page, Peter Reddien, Francisco J. Sánchez-Rivera, Brandon (Brady) Weissbourd, Jonathan Weissman, Harikesh S. Wong, Michael B. Yaffe.
David Bartel studies molecular pathways that regulate eukaryotic gene expression by affecting the stability or translation of mRNAs.
In immune cells, X marks the spot(s)
Gene silencing tool has a need for speed
CHARMed collaboration creates a potent therapy candidate for fatal prion diseases
She’s fighting to stop the brain disease that killed her mother before it gets her
“Rosetta Stone” of cell signaling could expedite precision cancer medicine
Taking RNAi from interesting science to impactful new treatments
Q&A: Pulin Li on recreating development in the lab
Scientists develop a rapid gene-editing screen to find effects of cancer mutations
Open Access
Essays articulate a specific perspective on a topic of broad interest to scientists.
A field guide to cultivating computational biology
Affiliations Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America, Center for Health AI, University of Colorado School of Medicine, Aurora, Colorado, United States of America
Affiliation Center for Health AI, University of Colorado School of Medicine, Aurora, Colorado, United States of America
Affiliations RIKEN Center for Integrative Medical Sciences Yokohama, Kanagawa, Japan, Human Technopole, Milan, Italy
Affiliation Department of Statistics, Institute of Mathematics, Statistics and Scientific Computing, University of Campinas, Campinas, Brazil
Affiliation RIKEN Center for Integrative Medical Sciences Yokohama, Kanagawa, Japan
Affiliation Department of Biomedical Engineering, Quantitative and Computational Biology, and Chemical Engineering & Materials Science, University of Southern California, Los Angeles, California, United States of America
Affiliation Pacific Northwest National Laboratory, Seattle, Washington, United States of America
Affiliation Melbourne Integrative Genomics, School of Mathematics and Statistics, The University of Melbourne, Melbourne, Australia
Affiliation Ellison Institute and Departments of Medicine/Oncology, Chemical Engineering, and Material Sciences, University of Southern California, Los Angeles, California, United States of America
Affiliation Department of Pathology and Laboratory Medicine, Weill-Cornell Medicine, New York, New York, United States of America
Affiliation Computational Biology Lab, New York Genome Center, New York, New York, United States of America
Affiliation Department of Applied Mathematics, University of California Merced, Merced, California, United States of America
Affiliation Institute of Computational Biology, Helmholtz Center Munich and Department of Mathematics, Technical University of Munich, Munich, Germany
Affiliation Charles Perkins Centre and School of Mathematics and Statistics, The University of Sydney, Australia
* E-mail: [email protected] (AEC); [email protected] (EJF)
Affiliation Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- [ ... ],
Affiliation Convergence Institute, Departments of Oncology, Biomedical Engineering, and Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, Maryland, United States of America
- Gregory P. Way,
- Casey S. Greene,
- Piero Carninci,
- Benilton S. Carvalho,
- Michiel de Hoon,
- Stacey D. Finley,
- Sara J. C. Gosline,
- Kim-Anh Lê Cao,
- Jerry S. H. Lee,
Published: October 7, 2021
- https://doi.org/10.1371/journal.pbio.3001419
Evolving in sync with the computation revolution over the past 30 years, computational biology has emerged as a mature scientific field. While the field has made major contributions toward improving scientific knowledge and human health, individual computational biology practitioners at various institutions often languish in career development. As optimistic biologists passionate about the future of our field, we propose solutions for both eager and reluctant individual scientists, institutions, publishers, funding agencies, and educators to fully embrace computational biology. We believe that in order to pave the way for the next generation of discoveries, we need to improve recognition for computational biologists and better align pathways of career success with pathways of scientific progress. With 10 outlined steps, we call on all adjacent fields to move away from the traditional individual, single-discipline investigator research model and embrace multidisciplinary, data-driven, team science.
Citation: Way GP, Greene CS, Carninci P, Carvalho BS, de Hoon M, Finley SD, et al. (2021) A field guide to cultivating computational biology. PLoS Biol 19(10): e3001419. https://doi.org/10.1371/journal.pbio.3001419
This is an open access article, free of all copyright, and may be freely reproduced, distributed, transmitted, modified, built upon, or otherwise used by anyone for any lawful purpose. The work is made available under the Creative Commons CC0 public domain dedication.
Funding: The authors thank the National Institutes of Health for research funding (R35 GM122547 to AEC, R01 HG010067 to CSG, and NCI U01CA253403 to EJF). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Biology in the digital era requires computation and collaboration. A modern research project may include multiple model systems, use multiple assay technologies, collect varying data types, and require complex computational strategies, which together make effective design and execution difficult or impossible for any individual scientist. While some labs, institutions, funding bodies, publishers, and other educators have already embraced a team science model in computational biology and thrived [ 1 – 7 ], others who have not yet fully adopted it risk severely lagging behind the cutting edge. We propose a general solution: “deep integration” between biology and the computational sciences. Many different collaborative models can yield deep integration, and different problems require different approaches ( Fig 1 ).
Scientists who have little exposure to different fields build silos, in which they perform science without external input. To solve hard problems and to extend your impact, collaborate with diverse scientists, communicate effectively, recognize the importance of core facilities, and embrace research parasitism. In biologically focused parasitism, wet lab biologists use existing computational tools to solve problems; in computationally focused parasitism, primarily dry lab biologists analyze publicly available data. Both strategies maximize the use and societal benefit of scientific data.
https://doi.org/10.1371/journal.pbio.3001419.g001
In this article, we define computational science extremely broadly to include all quantitative approaches such as computer science, statistics, machine learning, and mathematics. We also define biology broadly, including any scientific inquiry pertaining to life and its many complications. A harmonious deep integration between biology and computer science requires action—we outline 10 immediate calls to action in this article and aim our speech directly at individual scientists, institutions, funding agencies, and publishers in an attempt to shift perspectives and enable action toward accepting and embracing computational biology as a mature, necessary, and inevitable discipline ( Box 1 ).
Box 1. Ten calls to action for individual scientists, funding bodies, publishers, and institutions to cultivate computational biology. Many actions require increased funding support, while others require a perspective shift. For those actions that require funding, we believe convincing the community of need is the first step toward agencies and systems allocating sufficient support.
- Respect collaborators’ specific research interests and motivations. Problem: Researchers face conflicts when their goals do not align with collaborators. For example, projects with routine analyses provide little benefit for computational biologists. Solution: Explicit discussion about interests/expertise/goals at project onset. Opportunity: Clearly defined expectations identify gaps, provide commitment to mutual benefit.
- Seek necessary input during project design and throughout the project life cycle. Problem: Modern research projects require multiple experts spanning the project’s complexity. Solution: Engage complementary scientists with necessary expertise throughout the entire project life cycle. Opportunity: Better designed and controlled studies with higher likelihood for success.
- Provide and preserve budgets for computational biologists’ work. Problem: The perception that analysis is “free” leads to collaborator budget cuts. Solution: When budget cuts are necessary, ensure that they are spread evenly. Opportunity: More accurate, reproducible, and trustworthy computational analyses.
- Downplay publication author order as an evaluation metric for computational biologists. Problem: Computational biologist roles on publications are poorly understood and undervalued. Solution: Journals provide more equitable opportunities, funding bodies and institutions improve understanding of the importance of team science, scientists educate each other. Opportunity: Engage more computational biologist collaborators, provide opportunities for more high-impact work.
- Value software as an academic product. Problem: Software is relatively undervalued and can end up poorly maintained and supported, wasting the time put into its creation. Solution: Scientists cite software, and funding bodies provide more software funding opportunities. Opportunity: More high-quality maintainable biology software will save time, reduce reimplementation, and increase analysis reproducibility.
- Establish academic structures and review panels that specifically reward team science. Problem: Current mechanisms do not consistently reward multidisciplinary work. Solution: Separate evaluation structures to better align peer review to reward indicators of team science. Opportunity: More collaboration to attack complex multidisciplinary problems.
- Develop and reward cross-disciplinary training and mentoring. Problem: Academic labs and institutions are often insufficiently equipped to provide training to tackle the next generation of biological problems, which require computational skills. Solution: Create better training programs aligned to necessary on-the-job skills with an emphasis on communication, encourage wet/dry co-mentorship, and engage younger students to pursue computational biology. Opportunity: Interdisciplinary students uncover important insights in their own data.
- Support computing and experimental infrastructure to empower computational biologists. Problem: Individual computational labs often fund suboptimal cluster computing systems and lack access to data generation facilities. Solution: Institutions can support centralized compute and engage core facilities to provide data services. Opportunity: Time and cost savings for often overlooked administrative tasks.
- Provide incentives and mechanisms to share open data to empower discovery through reanalysis. Problem: Data are often siloed and have untapped potential. Solution: Provide institutional data storage with standardized identifiers and provide separate funding mechanisms and publishing venues for data reuse. Opportunity: Foster new breed of researchers, “research parasites,” who will integrate multimodal data and enhance mechanistic insights.
- Consider infrastructural, ethical, and cultural barriers to clinical data access. Problem: Identifiable health data, which include sensitive information that must be kept hidden, are distributed and disorganized, and thus underutilized. Solution: Leadership must enforce policies to share deidentifiable data with interoperable metadata identifiers. Opportunity: Derive new insights from multimodal data integration and build datasets with increased power to make biological discoveries.
Respect collaborators’ specific research interests and motivations
Computational biology hinges on mutual respect between scientists from different disciplines, and key elements of respect are understanding a colleague’s particular expertise and motivation. Individual scientists cannot stick strictly to their “home” discipline or treat one as working in service of another. Computationalists do not like to be seen as “just” running the numbers any more than biologists appreciate the perception that they are “just” a pair of hands that produced the data. Statistics, database structures, clinical informatics, genetics, epigenetics, genomics, proteomics, imaging, single-cell technologies, structure prediction, algorithm development, machine learning, and mechanistic modeling are all distinct fields. Biologists should not be offended if a particular idea does not fit a computational biologist’s research agenda, and computational scientists need to clearly communicate analysis considerations, approaches, and limitations.
Some institutions subsidize core facilities, which offer a variety of data collection and analytical services across a spectrum of data types. While some core services can carry out custom analyses and collect novel data types, others may be limited to standardized analysis and data collection pipelines due to their mission, bandwidth, or expertise. US National Laboratories are an unusual environment; unlike most core facilities, scientific careers are focused on technology development and benefit from internally allocated funding for their own research programs. As a community, we must value critical data and insights contributed by core facility staff by including them as authors.
Certain grant mechanisms can provide flexibility for computational biologists to develop new technologies, but the scope often focuses on method development, limiting the ability to collaborate on application-oriented projects. The current academic systems incentivize mechanism and translational discovery for biology but methodological or theoretical advances for computational sciences. This explains a common disconnect when collaborating: Projects that require routine use of existing methodology typically provide little benefit to the computational person’s academic record no matter how unique a particular dataset.
Therefore, we urge team science practitioners to conduct a transparent and explicit discussion of each investigator’s expertise, limitations, goals, expectations, deliverables, and publication strategies upfront. Matching research interests can facilitate dual submission of methodological and biological manuscripts, which provides leading roles for all investigators in the research team.
Seek necessary input during project design and throughout the project life cycle
Interdisciplinary projects without sufficient planning risk wasting time and resources. Scientists lacking particular expertise for a project should engage collaborators with such expertise from the beginning of the research project lifecycle [ 8 ]. Computational scientists may have critical insights that impact the scope of the biological questions, study design, analysis, and interpretation. Similarly, biologists’ early involvement may influence the algorithmic approach, data visualization, and refinement of analysis. The onset of a project is the ideal time to plot out feasibility, brainstorm solutions, and avoid costly missteps. It is also the time to establish clear communication and define expectations and responsibilities, in particular in the gray area between (experimental) data generation and (computational) data analysis.
Computational scientists should learn about data acquisition techniques and the factors that influence data quality, as well as the cost to collect new datasets. As the project progresses, collaborators must understand that data analysis is rarely turnkey: Optimal analysis requires iteration and engagement and can yield fruitful discovery and new questions to ask.
Provide and preserve budgets for computational biologists’ work
There is a common misconception that the lack of physical experimentation and laboratory supplies makes computational work automated, quick, and inexpensive. This stems from a sense that labor/time is a “free” resource in academia, and perhaps that each analysis should only need to be run a single time. In reality, even for well-established data types, analysis can often take as much or more time and effort to perform as generating the data at the bench. Moreover, it typically also requires pipeline optimization, software development and maintenance, and user interfaces so that methods remain usable beyond the scope of a single publication or project. Given that computational biologists often command higher salaries due to competition with industry, researchers should ensure space in the budget to support these efforts and computing costs for projects.
Scientists, institutions, and funders should also preserve budgets for collaborative researchers. When funding agencies impose budget cuts, computational collaborators’ budgets are often the first to go. This can substantially impact computational laboratories’ ability to provide independent projects and salaries to their trainees and staff members, especially considering that time spent providing preliminary analyses or ideas for the proposal cannot be recouped. In one computational biology laboratory, one-third of its collaborators’ funded grants cut their budget entirely, and another third cut their budget partially—by an average of 90% [ 9 ]. Although principal investigators are authorized to make budget changes at will, they should consider the impact of doing so on the long-term health of their relationship with their computational collaborators and the scientific community more generally. Institutions and agencies can help promote good behavior by a simple policy change: By default, budget cuts are distributed evenly; the lead investigator can then propose changes if needed. Societies should also advocate for this change at the funding agencies.
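To make the scale of the reported cuts concrete, the back-of-the-envelope calculation below combines the figures quoted above. It rests on two assumptions that are not stated in the text: that every collaboration originally requested the same budget, and that the remaining third of grants kept their computational budget intact.

```python
from fractions import Fraction

# Shares of funded grants in each category, as reported in the text.
cut_entirely = Fraction(1, 3)
cut_partially = Fraction(1, 3)
untouched = 1 - cut_entirely - cut_partially  # assumption: the rest were not cut

# An average 90% cut leaves 10% of the requested budget.
retained_if_partial = 1 - Fraction(90, 100)

# Overall fraction of the requested collaborative budget that survives,
# assuming equal original budgets across grants (an assumption).
retained_overall = (cut_entirely * 0
                    + cut_partially * retained_if_partial
                    + untouched * 1)

print(retained_overall, round(float(retained_overall), 3))
```

Under these assumptions, only 11/30 (roughly 37%) of the originally requested collaborative funding would remain.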
Downplay publication author order as an evaluation metric for computational biologists
Many high-impact papers have computational biologists in key authorship positions. In biology journals, these are customarily the first and last positions. Middle-author placements are very common for computational biologists, reflecting a methodological contribution on a paper to address a particular biological question. Co-first or co-senior authorships provide a means to provide credit when computational contributions are equally important to a paper, but such designations are often dramatically discounted in grant, hiring, promotion, and tenure assessments. For example, we have seen recent comments such as: “No recent first or senior author papers were listed (though one acknowledges several were ‘starred’ co-first authored)” ( Fig 2 ).
Everyone receives bad reviews; most are unavoidable. However, we notice a consistent trend with reviewers from disparate fields not being equipped to understand, evaluate, or appreciate the importance of computational biology contributions to team science. We collected these critiques via Twitter and received permission to directly quote.
https://doi.org/10.1371/journal.pbio.3001419.g002
Although the idea of breaking away from a linear, rank-ordered list of authors may seem unimaginable, we note that the biology journal standards of author order are not universal. In fields such as mathematics and physics, authors are listed alphabetically. These differences further complicate evaluation of computational publication records by biologists, who are typically not aware of the relative reputation of computational publishing venues (and vice versa). Other author ordering alternatives are possible, and scientific publishing has experienced dramatic change in recent years [ 10 ]. One option is for journals to formally encourage swapping the order of authors that have been designated as “equally contributing” via their display interfaces. Another might be to allow designations such as the “corresponding author on experimental aspects” and the “corresponding author on bioinformatics.”
Fundamentally, these issues will only be solved through educating institutional leaders and grant/paper reviewers on how to best understand and appreciate different kinds of scientists’ contributions. From calling computational biologists “research parasites” [ 11 ] to “mathematical service units” [ 2 ], it is clear that we have a long way to go ( Fig 2 ). Institutions can address this by altering promotion structures to depend upon an author’s contributions, regardless of order in the publication. One innovative solution was undertaken in 2021 by the Australian Research Council (ARC): A 500-word section was added to proposals for applicants to describe “research context” and “explain the relative importance of different research outputs and the expectations” in the specific discipline/s of the applicant [...] ten most relevant scientific results (articles, registered software, patents, and other items that can be chosen at the author’s discretion) and verifiable description of the impact of these items (citations, prizes, impact in public policies, and so on). The Royal Society Resume for Researchers was also designed to support and highlight a variety of research contributions [ 12 ]. This provides an opportunity to explain author order and other information helpful to assess merit within a discipline.
Value software as an academic product
Another complication to evaluating computational biologists is that their primary research output may not be papers but instead valuable software or data [ 13 ]. The US National Institutes of Health changed its biosketch format to explicitly encourage software products to be mentioned in its “Contributions to Science” section, where usage and impact can be described. It is becoming more common to publish software and data papers, aided in part by some journals creating article types specifically for software and data. Allowing updated versions of software to also be published provides academic credit for the largely thankless task of software maintenance over time.
Schemes for tracking the usage of software and data are a work in progress [ 14 ]. Citation counts for papers are a key metric used in career decisions but are inconsistent for software and data papers. Every scientist can help ensure that software and data creators receive credit for their work by citing the related papers rather than just mentioning the name of the software, website, or database (or omitting mention entirely), which leads to inconsistent tracking when evaluating investigator impact [ 15 ]. Journals can require that software and data are properly cited.
Quality software development and maintenance is crucial for efficient and reproducible data analysis and is a key ingredient for successful computational biology projects [ 16 ]. Community-level software ecosystems and pipeline-building tools have outsized impact because they standardize analyses and minimize software development costs for individual labs [ 13 ]. Yet, academic systems, which prioritize innovation, commonly undervalue software maintenance and development. Software products funded across projects, such as slurm or singularity at US National Laboratories, have provided valuable resources for the broader scientific community [ 17 , 18 ]. These projects were initially funded as independent tools, which supported independent computational biology labs. Software has a major impact on the progress of science but is underfunded by many agencies. The few agencies that do fund software maintenance are spread thin, given the global demand [ 19 ].
Establish academic structures and review panels that specifically reward team science
Wet lab biologists trained in traditional evaluation schemes can be quick to dismiss a researcher with a lack of a single driving biological question for the laboratory, many middle-authored papers, publication in computational conferences rather than journals, a low citation count or h-index (due to field-specific differences), or funding through grants led by others, leading to comments such as “How do people like you ever get last-author papers?” and “an overly strong reliance on collaborators” [ 2 ]. Computational scientists can dismiss a body of work as too applied, with not enough theory or conference papers that are the currency of the field. Evaluation panels should therefore include interdisciplinary researchers and be provided with guidelines about the challenges of interdisciplinary research. If, for example, middle authorships are seen by review committees as worthless (even, in some reported cases, detrimental) to a publication record, major contributors to the progress of science will go “extinct,” unable to get their research funded. It is therefore important for institutions to learn to appreciate the value of many small contributions versus a few large contributions.
We hope to reach a day where quantitative skills are so pervasive (and valued) that calling someone a “computational biologist” sounds just as odd as a “pipette biologist” [ 2 ]. In the meantime, establishing separate structures and review schemes is another approach to support this class of researchers. To promote faculty success, many institutions have created Systems Biology, Computational Biology, or Biomedical Informatics departments to provide an environment in which researchers can thrive and be evaluated by like-minded interdisciplinary colleagues. Similarly, interdisciplinary journals, grant review panels, and funding schemes support the publication and funding of work evaluated by peers. Some organizations are already promoting team science efforts by shifting cultures in recognition, funding, and career development, such as the UK Academy of Medical Sciences and National Research Council [ 20 ].
Likewise, institutions should take care to ensure that interdisciplinary researchers are recognized and rewarded for contributions across disciplines and departments, for example, through the evaluation system, additional compensation, and supplemental administrative staff. These researchers face more than their fair share of demands from collaborative roles on grants, administrative leadership, educational initiatives, thesis committees, and consultations. Many are jointly appointed to 2 or more departments, introducing additional service requirements that are invisible to each individual department [ 21 ]. Computational biologists can struggle to prioritize their research and these demands—even more so if they are in an underrepresented demographic, as such scientists face disproportionate demands on their time and disproportionate costs (being labeled uncollaborative and unsupportive) if they decline. Instead of ignoring or even negatively assessing team science contributions, promotion and tenure committees should include criteria that are indicators of success in collaborative-style work, including effort level as co-investigator on grants, core facility leadership, collaborative authorship contributions, and community service.
Develop and reward cross-disciplinary training and mentoring
As large datasets become increasingly common, computational expertise is a necessary asset for any biologist. The deepest insights often result from data analyzed by the biologists who designed and conducted the experiment. Likewise, an understanding of biological data is a necessary asset for any computer scientist working on biological data. The most impactful methodological leaps often result from a computer scientist with a deep understanding of the nuances and limitations of particular data. Institutions can help hybrid trainees bridge gaps through computational biology training, strategic organization of physical space, and team science–oriented evaluation metrics for mentors.
The first step in acquiring practical computational biology skills is to become comfortable with the basics in the unfamiliar domain: either programming or biology. Societies and other nonprofits can play a major role here; examples include iBiology, Software Carpentries, CABANA in Latin America, and NEUBIAS in Europe [ 22 – 25 ]. Still, institutions must also explicitly support cross-disciplinary training and mentoring. Institutions can provide educational opportunities focused on basic programming, data analysis, and reproducibility, as well as core biology principles chosen according to unique institutional strengths. Teaching collaborative and interdisciplinary skills ideally begins at the undergraduate level, where courses should be redesigned to blend computer science and biology.
Institutions can also organize spaces to facilitate deep integration between biology and computer science. Combining wet and dry lab spaces encourages interactions among researchers with diverse expertise. For computational biology trainees who are embedded in laboratories with a focused and single-discipline research agenda, institutions and lab heads can seek mentors from complementary domains, and these external mentors should be rewarded institutionally. We urge institutional oversight for trainees hired by individual out-of-discipline PIs to be sure that they are learning in their chosen field and not treated as inexpensive hired hands. We also emphasize that not all labs require supplemental trainee supervision and mentorship; many lead investigators in computational biology are quite experienced enough to cover both sides ( Fig 3 ).
(A) Existing biology and computational labs can cooperate to provide complementary mentorship to computational biologists, and hybrid labs can provide sufficient support. Institutions must provide oversight to ensure that ill-equipped labs have trainees’ career goals in mind and do not view them as inexpensive labor. (B) Computational biology programs are at an advantage to provide necessary training to forge the biologists of the future.
https://doi.org/10.1371/journal.pbio.3001419.g003
A challenge of being interdisciplinary in many current academic structures is being evaluated against conflicting, field-specific metrics that often fail to incentivize meaningful contributions to scientific progress. Mentors and departments should work to change evaluation systems to reward team science and, in the meantime, guide computational biology trainees to ensure that their career goals can be met. Furthermore, mentors and departments should emphasize communication skills through explicit focus on cross-disciplinary conversations, brainstorming, and presentations. For example, training programs at National Labs typically require that trainees meet regularly with those outside of their primary area. Within academic departments, offices exclusively focused on scientific communication empower scientists to reach out across disciplines and to the public.
Support computing and experimental infrastructure to empower computational biologists
Computing infrastructure (whether local cluster, volunteer, or cloud-based) is essential to modern biomedical data science and requires compute power, data storage, networking, and system administration, all of which can introduce significant costs to research projects. At most institutions, this computing infrastructure is neither centrally provided nor well supported; it often falls through the cracks between institutional information technology (IT) and research offices. As a result, individual laboratories often independently (and inefficiently) fund their own infrastructure and perform systems administration for clusters or the cloud, work that is far outside their actual expertise. Institutions that wish to foster computational research should subsidize infrastructure costs and provide support staff to assist in its use. Some funding agencies already offer grant mechanisms that support computing hardware or cloud-based computing, such as the US STRIDES initiative, but others should also recognize these necessary costs and increase budgets to better support computational biology.
Historically, computational biologists relied on collaborators or public resources for datasets. It is now becoming more common for researchers with an informatics background to be running experiments, whether as a primary data source or to benchmark new technologies, validate algorithms, and test predictions and theories. Being able to bridge the dry lab–wet lab gap can have a major impact on a computational biologist’s success. Institutions and laboratory neighbors can flexibly offer laboratory space for computational biologists when needed, and they can offer appropriate biology mentors for computational biologists joining computational groups. Core facilities can extend support: for example, rather than only ingesting fully prepped samples or only providing instrument time, facilities can offer full service (from sample prep to data acquisition) to computational laboratories for an additional fee.
Provide incentives and mechanisms to share open data to empower discovery through reanalysis
“Data available upon request” is outdated, ineffective [ 26 ], and should be outlawed in publications (privacy reasons excepted, see next section). Data and code often become more valuable over time with new techniques or complementary new data. Indeed, some of the most challenging problems facing biological and clinical research are only possible through access to well-curated, large-scale datasets. Funding agencies, publishers, and the scientific community must continue to recognize both dissemination and reanalysis of reusable data as an impactful research output. Findable, accessible, interoperable, and reusable (FAIR) principles have already been adopted and even required by many funding agencies and organizations [ 27 ]. New “Resource” article types have been introduced in many journals, and entirely new journals (e.g., Gigascience, Scientific Data, and Data in Brief) launched to provide a publication route, yielding academic credit for the creation, organization, and sharing of useful datasets.
Explicit funding mechanisms aimed at data reuse can also facilitate algorithm development. These mechanisms can range from grant applications specifically targeting dataset reuse to a hackathon hosted by a disease foundation or even a DREAM challenge in which the organizers curate a dataset specifically for algorithm development. These exercises also encourage the next generation of researchers to challenge, or augment, conclusions reported in original marker papers. Institutions can also fund internal algorithm development efforts to improve analyses of commonly generated data.
Data-generating biologists can share data types through a growing ecosystem of repositories [ 28 ]. Logistically, storing and disseminating data is a complex task, even for those with computational skills; it requires effective communication and knowledge encompassing IT, database structures, security/privacy, and desktop support that is distinct from analysis and the development of computational methods. Institutions and funding agencies should provide specialists in this and develop interfaces with consistent ontologies to ease the process and facilitate data reuse. Creating mechanisms for capturing metadata and incentives that support high-quality annotations can help to return more value from computational analyses [ 29 , 30 ]. To maximize the engagement of computational biologists, it is often helpful to provide data in a raw form as well as commonly used summary forms—controlling access as required to meet ethical and legal constraints [ 28 , 31 ].
Consider infrastructural, ethical, and cultural barriers to clinical data access
Institutions with large medical centers are realizing the promise of discovery from multimodal datasets of their unique patient cohorts, including primary clinical data as well as data from corresponding biospecimens using emerging molecular and imaging technologies. For example, deep learning has the potential to revolutionize pathology, but sufficient data are needed, often thousands of annotated images spanning patient groups. Infrastructural, ethical, and cultural considerations all place barriers to large-scale data access for computational research.
Institutions can play a role in supporting data access by providing centralized database structures that automatically ingest patient electronic health records and research-level data collected on these patients [ 32 ]. This requires careful attention to privacy, such that access to data is provided only to legally authorized researchers. It also requires policies in place governing whether patients will be notified about any potential health risks uncovered based on their data. Ethicists and clinical societies play a critical role in developing appropriate guidelines for such computational research on patient datasets that consider patient privacy and potential for improved public health. Inclusion of demographic information in these datasets and outreach to underrepresented populations is critical to overcome biases in data-driven discovery and introduces further ethical considerations.
The current system rewards scientists and institutions that closely guard clinical datasets, because exclusive analysis can benefit careers and the data can be monetized due to commercial demand for novel biomarkers and therapeutic targets. Strong leadership is essential to incentivize academic investigators to collect datasets in a coordinated way across disease groups to enable research for public benefit. Federated learning, where machine learning models can be trained on multiple datasets without actually sharing the raw data, may work around some barriers.
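To make the federated learning idea concrete, here is a minimal sketch using federated averaging on a toy one-parameter linear model. The site names, data values, learning rate, and function names are illustrative assumptions rather than anything described in the article; the point is only that each site shares an updated model weight, never its raw records.

```python
def local_update(weight, data, learning_rate=0.01, steps=5):
    """Run a few gradient-descent steps on one site's private (x, y) pairs."""
    for _ in range(steps):
        # Gradient of the mean squared error for the model y = weight * x.
        gradient = sum(2 * (weight * x - y) * x for x, y in data) / len(data)
        weight -= learning_rate * gradient
    return weight


def federated_average(sites, rounds=50, weight=0.0):
    """Coordinate training: sites return weights only, never raw data."""
    total = sum(len(data) for data in sites)
    for _ in range(rounds):
        # Each site starts from the shared weight and trains locally.
        updates = [(local_update(weight, data), len(data)) for data in sites]
        # The coordinator averages the weights, weighted by site sample count.
        weight = sum(w * n for w, n in updates) / total
    return weight


# Hypothetical toy datasets held at two sites; both follow y = 2x.
hospital_a = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
hospital_b = [(4.0, 8.0), (5.0, 10.0)]

slope = federated_average([hospital_a, hospital_b])
print(round(slope, 4))  # prints 2.0
```

Real deployments add secure aggregation, privacy accounting, and far larger models; this sketch omits all of that.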
Conclusions
Visionaries a decade ago aspired to bridge the domains of computational sciences and biology [ 1 – 7 ]. Since then, computational biology has emerged as a mature scientific discipline. It’s time that traditional academic schemes built for the era of single-discipline biology evolve to support the interdisciplinary team science necessary for human progress.
As computational biology has grown rapidly over the last 30 years, we may ask what computational biology will look like 30 years from now, if cultures shift in the ways we propose.
We foresee the ever-growing amount of data and associated analytical questions outstripping the supply of researchers with computational skills. This unmet demand will drive wet lab biologists to use software, which funding bodies and publishers should support to become more optimized, reproducible, maintainable, and easy to use without relying on a dedicated computationalist. Likewise, as more data become open and interoperable and as contracted wet lab facilities grow, computationalists may not need to rely on dedicated wet lab scientists.
In the interim, as the field continues to develop new methods and biological data types, we foresee research parasitism and team science collaborations flourishing, and the scientists and institutions who focus most on cultivating computational biology will be rewarded.
Over time, the distinction between wet and dry biologists may fade, as both are working toward a common goal of understanding biology, and hybrid biologists [ 2 , 33 ] will emerge who understand the importance of collaboration and are equally adept at the experimental and the computational aspects of biology.
Acknowledgments
The authors thank the interdisciplinary researchers who have paved the way for computational biology.
- 12. Research culture: Résumé for Researchers. [cited 2021 Jul 23]. Available from: https://royalsociety.org/blog/2019/10/research-culture/
- 20. National Research Council (U.S.). Committee on the Science of Team Science, National Research Council (U.S.). Division of Behavioral and Social Sciences and Education. Enhancing the Effectiveness of Team Science. National Academies Press; 2015.
Instructors
- Prof. Christopher Burge
- Prof. Jeff Gore
- Prof. Wendy Gilbert
- Prof. Bruce Tidor
- Prof. Forest White
Departments
As taught in.
- Computational Biology
- Computation and Systems Biology
Topics in Computational and Systems Biology
Course Description
This is a seminar based on research literature. Papers covered are selected to illustrate important problems and approaches in the field of computational and systems biology, and provide students a framework from which to evaluate new developments.
The MIT Initiative in Computational and Systems Biology ( CSBi ) is a campus-wide research and education program that links biology, engineering, and computer science in a multidisciplinary approach to the systematic analysis and modeling of complex biological phenomena. This course is one of a series of core subjects offered through the CSB Ph.D. program, for students with an interest in interdisciplinary training and research in the area of computational and systems biology.
Phillip Compeau
Associate Teaching Professor and Assistant Department Head, Computational Biology Department, Carnegie Mellon University
Great Ideas in Computational Biology
About the Course
Great Ideas in Computational Biology (02-251) is a 12-unit course offered to students at Carnegie Mellon University who are interested in an introduction to the field of computational biology. It is taken by students of all years of study, but it is aimed at School of Computer Science first-year students who are interested in the computational biology major. I am unaware of a computationally rigorous introduction to computational biology for first-year undergraduates at any other institution.
The course was taught to its first cohort in spring 2019 as a joint project with Carl Kingsford. I have taught the course as a solo project since that time, making many changes to the subjects taught in response to what students have reported particularly enjoying and after consulting our faculty in the Computational Biology Department.
In spring 2021, I tried something a bit different by incentivizing course participation with donations to charity. The COVID-19 pandemic has meant that students find it difficult to engage in their courses, and I wrote about my efforts to reward them for doing so in my course.
In 2022, I won the Herbert A. Simon Award for Teaching Excellence in Computer Science for my work in teaching this course. This award is the top teaching honor bestowed by Carnegie Mellon’s School of Computer Science.
Student Testimonials
“Compeau is a legend. I disliked biology before I took his class and still do a little, but he made me fall in love with compbio this semester. This is one of the best classes for students looking to explore the applications of CS on other fields.”
“Professor Compeau spends a lot of time structuring his class towards a computational perspective to encourage computer scientists to delve into computational biology. He provides relevant biological information in a clear and concise way to ensure that students without a biology background can be on equal grounds with those who have extensive ones.”
“Incredible class and amazing lectures! This is definitely the best class I’ve taken at CMU.”
“The best Professor that I have ever met in my whole college life.”
“This class is just good stuff. Thank you for being an amazing professor who has changed my perspective on computer science in a positive way. I cant wait to delve more into Computational Biology after this course :)”
“This is a class that made me glad I chose to come to CMU. The class gives you a great taste of so much going on in computational biology – all accessible for students without any prior biology experience. Dr. Compeau is second to none. He is one of the most passionate professors I have ever met… Dr. Compeau was clearly rooting for our success not trying to break us! Studying remotely with a 16 hour time difference in Australia, I did not have any troubles getting the help I needed via Piazza. Seamless! Dr. Compeau changes the syllabus as computational biology advances – we discussed even advancements done in the past few months. There were several COVID assignments that gave detailed breakdowns and walked students through how researchers would have begun to study CoV2! Dr. Compeau also managed to get us some incredible guest lectures which I personally found fascinating! Overall, this class has had an incredible impact on my CMU career and recommend it to all!”
“Loved this course. I think its a shame that not everyone takes it.”
“Amazing class! I really feel that I learned a lot in terms of knowledge. The classroom environment is always very active and I’m impressed by how others think.”
“This is the best course I’ve taken at CMU thus far. It’s one of those classes that are hard enough to keep you engaged but not so hard that you feel hopeless, and the work you do feels meaningful and you understand why each piece exists, and how it helps you learn. The lectures are also amazing and I really liked them and showing up was very worth it. Also had the most interesting content of all the courses I’ve taken. 10/10 would recommend.”
Course Topics
The first half of the course provides a broad overview of topics in fundamental bioinformatics algorithms. Some of that material is adapted from my Bioinformatics Algorithms project.
The second half of the course samples beautiful ideas from a variety of different areas, taking a broad view of computational biology as the field continues to evolve. Some of these areas include biological network analysis, cell and systems modeling, DNA computing, automated science, and algorithms in nature.
I am providing the week-by-week lecture slides in PDF format below as a public resource. Some topics, such as how the fundamental algorithms miniasm, Clustal, and BLAST work, are presented as mini-lectures as part of the course recitations. If you are interested in these materials, or in Bioinformatics Algorithms, please reach out to me.
Week 1: Assembling genomes
Great ideas covered:
- de Bruijn graphs for genome assembly
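As a minimal sketch of the de Bruijn graph idea (not taken from the course slides), the code below turns every k-mer in a set of reads into an edge from its (k-1)-mer prefix to its (k-1)-mer suffix. The reads and the function name are made up for illustration, and real assemblers additionally handle sequencing errors, reverse complements, and repeats.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer prefix to the (k-1)-mer suffixes that follow it.

    One edge is recorded per k-mer occurrence, so repeated k-mers
    appear as repeated entries in the adjacency list.
    """
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


# Hypothetical overlapping reads.
graph = de_bruijn_graph(["ACGTAC", "GTACG"], k=3)
for prefix in sorted(graph):
    print(prefix, "->", ",".join(graph[prefix]))
```

Assembly then amounts to finding a path through this graph that uses every edge, which is why the representation is attractive: Eulerian paths can be found efficiently.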
Week 2: Finding hidden messages in DNA
- skew diagrams for locating replication origins in bacterial genomes
- Gibbs sampling and expectation maximization algorithms for finding motifs in transcription factor binding sites
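The skew diagram from this week can be sketched in a few lines. The running skew adds 1 for each G and subtracts 1 for each C; in many bacterial genomes the position where the skew reaches its minimum is a reasonable first guess for the replication origin. The function names and example string below are illustrative assumptions.

```python
def gc_skew(genome):
    """Return the running (#G - #C) count for every prefix of the genome."""
    skew = [0]
    for base in genome:
        if base == "G":
            delta = 1
        elif base == "C":
            delta = -1
        else:
            delta = 0
        skew.append(skew[-1] + delta)
    return skew


def min_skew_positions(genome):
    """Return every prefix length at which the skew attains its minimum."""
    skew = gc_skew(genome)
    lowest = min(skew)
    return [i for i, value in enumerate(skew) if value == lowest]


print(gc_skew("CCGG"))             # prints [0, -1, -2, -1, 0]
print(min_skew_positions("CCGG"))  # prints [2]
```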
Weeks 3-4: Sequence alignment
- Needleman-Wunsch algorithm for global sequence alignment
- Smith-Waterman algorithm for local sequence alignment
- affine sequence alignment
- hidden Markov models for multiple sequence alignment of variable sequences
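For readers who want to see the dynamic programming behind global alignment, here is a minimal Needleman-Wunsch sketch with a linear gap penalty. The scoring values (match +1, mismatch -1, gap -1) are arbitrary assumptions chosen for illustration; affine gaps and local (Smith-Waterman) alignment require changes to the recurrence that are not shown here.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for one optimal global alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty string costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + substitution,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,               # gap in b
                score[i][j - 1] + gap,               # gap in a
            )
    # Trace back from the bottom-right corner to recover one alignment.
    i, j = len(a), len(b)
    top, bottom = [], []
    while i > 0 or j > 0:
        substitution = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + substitution:
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(top)), "".join(reversed(bottom))


print(needleman_wunsch("ACGT", "AGT"))  # prints (2, 'ACGT', 'A-GT')
```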
Week 5: Evolutionary trees
- neighbor-joining algorithm
- Fitch algorithm for inferring ancestral states in a rooted tree
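The bottom-up pass of the Fitch algorithm is compact enough to sketch directly. In this illustration a rooted binary tree is written as nested tuples with single-character leaf states, which is a representation assumed here for brevity. Each internal node takes the intersection of its children's candidate sets if that is non-empty, and otherwise takes the union and counts one change. The top-down pass that picks a specific ancestral state at each node is omitted.

```python
def fitch(tree):
    """Return (candidate states at this node, minimum number of changes below it).

    A leaf is a string holding its observed state; an internal node is a
    (left, right) tuple.
    """
    if isinstance(tree, str):
        return {tree}, 0
    left, right = tree
    left_states, left_cost = fitch(left)
    right_states, right_cost = fitch(right)
    shared = left_states & right_states
    if shared:
        # The children can agree, so no extra change is needed here.
        return shared, left_cost + right_cost
    # The children disagree: keep all options and pay for one change.
    return left_states | right_states, left_cost + right_cost + 1


# Hypothetical four-leaf tree for a single alignment column.
states, changes = fitch((("A", "C"), ("A", "G")))
print(sorted(states), changes)  # prints ['A'] 2
```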
Week 6: Read mapping
- Suffix arrays
- Suffix trees
- The Burrows-Wheeler transform
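To show how the suffix array and the Burrows-Wheeler transform relate, the sketch below builds a suffix array by naive sorting and reads the transform off it. It assumes the text does not contain the sentinel character `$` and that `$` sorts before every other character, which holds for plain DNA and lowercase strings. Practical read mappers use far more efficient construction and an FM-index rather than the quadratic inversion shown here.

```python
def suffix_array(text):
    """Return the start positions of all suffixes in sorted order (naive)."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt(text):
    """Burrows-Wheeler transform: the character preceding each sorted suffix."""
    text += "$"  # sentinel marking the end of the text
    # Index -1 wraps around, so the suffix starting at 0 contributes "$".
    return "".join(text[i - 1] for i in suffix_array(text))


def inverse_bwt(transformed):
    """Invert the transform by repeatedly prepending and sorting (naive)."""
    table = [""] * len(transformed)
    for _ in range(len(transformed)):
        table = sorted(transformed[i] + table[i] for i in range(len(transformed)))
    original = next(row for row in table if row.endswith("$"))
    return original[:-1]


print(suffix_array("banana$"))  # prints [6, 5, 3, 1, 0, 4, 2]
print(bwt("banana"))            # prints annb$aa
print(inverse_bwt("annb$aa"))   # prints banana
```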
Week 7: RNA-Sequencing
- Adapting Burrows-Wheeler based read mapping to the problem of RNA-sequencing
- Spliced alignment to find splice junctions
- RNA transcript assembly
- Expectation maximization to quantify transcript abundances
- Differential expression analysis
Week 8: Proteins
- ab initio and homology approaches for protein structure prediction
- Comparing protein structures globally using RMSD and the Kabsch algorithm
- Comparing protein structures locally with contact maps and Qres
- Peptide sequencing and peptide identification
Week 9: Systems biology
- Motifs in transcription factor networks, including negative autoregulation and feed-forward motifs.
- Particle-based reaction-diffusion models.
- The repressilator motif and biological oscillators
- Gillespie’s stochastic simulation algorithm for simulating chemical reactions in a well-mixed environment, applied to bacterial chemotaxis.
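A minimal sketch of Gillespie's stochastic simulation algorithm follows. It simulates a simple birth-death process (constant production, first-order degradation) rather than the chemotaxis model used in the course; the rate constants, seed, and function name are assumptions made for illustration. At each step the waiting time is drawn from an exponential distribution with rate equal to the total propensity, and the reaction that fires is chosen in proportion to its propensity.

```python
import random


def gillespie_birth_death(k_prod, k_deg, x0, t_end, seed=0):
    """Simulate 0 -> X (rate k_prod) and X -> 0 (rate k_deg * x) until t_end."""
    rng = random.Random(seed)
    t, x = 0.0, x0
    trajectory = [(t, x)]
    while True:
        rates = [k_prod, k_deg * x]
        total = sum(rates)
        if total == 0:
            break  # no reaction can fire
        # Time to the next reaction is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            break
        # Pick which reaction fires, proportionally to its propensity.
        if rng.random() * total < rates[0]:
            x += 1
        else:
            x -= 1
        trajectory.append((t, x))
    return trajectory


trajectory = gillespie_birth_death(k_prod=10.0, k_deg=0.1, x0=0, t_end=50.0, seed=1)
print(len(trajectory), "events; final count:", trajectory[-1][1])
```

With these assumed rates the molecule count should fluctuate around k_prod / k_deg = 100 once the system settles, though any single run is random.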
Week 10: Neural networks and the evolution of modularity
Great ideas covered:
- McCulloch-Pitts neurons.
- Perceptrons and linear separability.
- Encoding logical propositions as networks of perceptrons.
- The universality of NAND (and therefore networks of perceptrons) for representing any binary function.
- The Alon-Kashtan algorithm demonstrating spontaneous evolution of modularity in a biological model.
- A ten-minute overview of deep learning.
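The universality argument can be illustrated with a short sketch: a single perceptron with hand-picked weights computes NAND, and XOR, which no single perceptron can compute because it is not linearly separable, falls out of a small network of NAND units. The specific weights and bias are one valid choice assumed here, not the only one.

```python
def perceptron(weights, bias, inputs):
    """Fire (return 1) when the weighted sum plus bias is positive."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def nand(a, b):
    """NAND as one perceptron: only the input (1, 1) drives the sum below zero."""
    return perceptron([-2, -2], 3, [a, b])


def xor(a, b):
    """XOR built from four NAND perceptrons."""
    c = nand(a, b)
    return nand(nand(a, c), nand(b, c))


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "NAND:", nand(a, b), "XOR:", xor(a, b))
```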
Week 11: Algorithms in nature
This week’s material is a little atypical for an introductory course in computational biology. It centers on the theme of algorithms implemented within nature, whether that is a bacterium, an insect, or a slime mold, that are used to solve problems heuristically. These algorithms are often distributed and based on probability, so that they are outside the realm of what students typically see in introductory computer science. Some of the problems that the algorithms are “solving” are in fact fundamental CS problems, and describing these algorithms led to surprising new contributions to computer science.
Special thanks in this section to Saket Navlakha, who provided some excellent advice on the most elegant ideas from this field to profile.
- E. coli’s chemotaxis exploration algorithm.
- Ant foraging algorithms.
- Slime mold transportation networks.
- A distributed heuristic for solving the maximal independent set problem implemented by Drosophila for choosing sensory organ precursor cells (SOPs) during development.
- A probabilistic approach based on Bloom filters and neural networks for solving the novel query problem, based on the Drosophila olfactory system.
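For context on the last item, here is a minimal sketch of a classical Bloom filter, the computer science data structure that the Drosophila-inspired approach is compared against. It is not the fly variant itself. The bit-array size, number of hash functions, and use of SHA-256 with a seed prefix are assumptions chosen for simplicity. A Bloom filter can report false positives but never false negatives, so an answer of "not seen" is always reliable.

```python
import hashlib


class BloomFilter:
    """Approximate set membership using a fixed-size bit array."""

    def __init__(self, size=256, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [False] * size

    def _positions(self, item):
        # Derive several hash positions by prefixing the item with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for position in self._positions(item):
            self.bits[position] = True

    def __contains__(self, item):
        # True means "possibly seen"; False means "definitely novel".
        return all(self.bits[position] for position in self._positions(item))


seen_odors = BloomFilter()
for odor in ["geraniol", "limonene"]:  # hypothetical odor labels
    seen_odors.add(odor)

print("geraniol" in seen_odors)  # prints True
```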
Week 12: A sampling of “mini-great ideas” in computational biology
- Genome rearrangements and the fragile breakage model of genomes
- Cellular automata (Game of Life and the self-replicating Langton loops)
- Spatial game theory
- Turing patterns and the Gray-Scott model
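As a small illustration of the cellular automata item, the sketch below advances Conway's Game of Life by one generation on an unbounded grid. Storing only the live cells as a set of coordinates is a representation chosen here for brevity. The rules are the standard ones: a cell is alive in the next generation if it has exactly three live neighbors, or if it is currently alive and has exactly two.

```python
from collections import Counter


def life_step(live):
    """Return the set of live cells after one Game of Life generation."""
    # Count, for every cell adjacent to a live cell, how many live neighbors it has.
    neighbor_counts = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    return {
        cell
        for cell, count in neighbor_counts.items()
        if count == 3 or (count == 2 and cell in live)
    }


# The "blinker" oscillates between a horizontal and a vertical bar.
blinker = {(0, 1), (1, 1), (2, 1)}
print(sorted(life_step(blinker)))  # prints [(1, 0), (1, 1), (1, 2)]
```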
Assessments
Students taking the course complete both theoretical and programming homework assignments. Starting in 2021, students also complete a collection of assignments that we developed to guide them through using existing open software to answer real research questions about SARS-CoV-2, and which I am providing to the community.
Finally, my favorite part about the course is that students all complete a project on applying computational analysis to a biological dataset of their own choosing. Students are required to write an essay detailing their work as well as deliver a short presentation to their peers. The projects that students produce are exceptional. Among a very strong group, I have chosen the following projects (with student permission) as stand-out examples of excellent essays for our course “ring of honor” shown below. These essays are not perfect, but they exemplify the superlative work that first year undergraduates can complete.
Course Project Ring of Honor
Zahra Ahmad, “Identification of Differentially Expressed Genes between Immune Phenotypes in Breast Cancer”
Viola Chen, “Investigation on difference in level of expression of cellular receptor for SARS-CoV-2, Angiotensin-converting Enzyme 2 (ACE2), regarding age, gender and organ”
Shyam Sai, “Breast Cancer Diagnosis and Prognosis Using Keras and OpenCV”
Eunseo Sung, “The Effects of Gene Expression on Pulmonary Adenocarcinoma Progression”
Meghana Tandon, “Quantifying Stability of Common [Metagenomics] Distance Metrics and Similarity Scores”
Priya Varra, “Classifying White Blood Cell Images Using Deep Learning”
Brian Zhang, “Avian Migration on the spread of Influenza A (H7N9) in China”
EDITORIAL article
Editorial: Big data and artificial intelligence for genomics and therapeutics – Proceedings of the 19th Annual Meeting of the MidSouth Computational Biology and Bioinformatics Society (MCBIOS)
- 1 National Center for Toxicological Research, United States Food and Drug Administration, Jefferson, AR, United States
- 2 Biology Department, University of Dallas, Irving, TX, United States
- 3 Department of BioMolecular Sciences, University of Mississippi, University, MS, United States
- 4 Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA, United States
Editorial on the Research Topic Big data and artificial intelligence for genomics and therapeutics – Proceedings of the 19th Annual Meeting of the MidSouth Computational Biology and Bioinformatics Society (MCBIOS)
The 19th annual MidSouth Computational Biology and Bioinformatics Society (MCBIOS) conference took place on the University of Dallas campus over the span of three days (March 15th–17th, 2023). The conference theme was “Big Data and Artificial Intelligence for Genomics and Therapeutics”. The program consisted of five keynote sessions, 10 breakout scientific sessions (with 30 invited speakers) and four hands-on workshops. The conference focused on cutting-edge topics in bioinformatics and computational biology, including application of big data and machine learning in precision medicine, machine learning and deep learning in safety evaluation and risk assessment, network medicine and drug discovery, single-cell multi-omics analysis, and computational approaches for immuno-oncology. In addition, some 60 posters were presented by scientists and trainees at the conference, highlighting the interdisciplinary nature of bioinformatics and its critical role in advancing biomedical research and healthcare. This research topic collects six articles contributed from outstanding presentations at this conference, five of which appear in Frontiers in Bioinformatics and one in Frontiers in Artificial Intelligence.
Huang et al. developed PAGER-scFGA, a novel tool for single-cell functional genomics analysis aimed at understanding cellular responses to stress and disease. PAGER-scFGA integrates cell functional annotations and gene-set enrichment analysis into existing single-cell analysis pipelines like Scanpy, enabling the identification of cell functions through enrichment of potential cell-marker genesets. It provides pathways, annotated gene lists, and gene signatures enriched in specific cell subsets, aiding in the characterization of molecular mechanisms underlying cell trajectories. Through a case study on mouse natural killer cells, PAGER-scFGA unveils stages and trajectories of NK cell maturation, highlighting cell cytotoxicity and response to interleukin signaling pathways. Overall, PAGER-scFGA offers a comprehensive knowledge map of gene networks and functional compartments, expected to be a vital tool for inferring cell functions and detecting molecular mechanisms in single-cell studies. The web app is publicly available for further exploration.
High-throughput sequencing has greatly increased the volume of gene expression data, now accessible in repositories like NCBI’s GEO. Efficiently querying and analyzing this vast data, especially for artificial intelligence (AI)/machine learning (ML), is challenging. BioVDB addresses this by serving as a specialized vector database for gene expression data, using Automatic Label Extraction (ALE) to annotate samples with metadata like age, sex, and tissue type. Created by Winnicki et al., BioVDB includes 438,562 samples from eight microarray platforms, enhancing data retrieval with similarity search to identify patterns and infer missing labels. This feature supports rapid similarity analysis, crucial for uncovering biological phenomena. By integrating with AI/ML tools, BioVDB bridges the gap between large datasets and advanced computational analysis, fostering deeper insights and accelerating biological discovery.
The FDA Adverse Events Reporting System (FAERS) database is crucial for post-marketing drug safety reviews, but its effectiveness is hampered by inconsistent drug naming. This heterogeneity arises partly because the database includes both mandatory reports prepared by pharmaceutical companies and voluntary submissions from patients and healthcare professionals. Studies using FAERS without normalizing drug names can yield incomplete and inaccurate results. The study by Le et al. highlights the utility of RxNorm, a tool from the National Library of Medicine, for standardizing drug names in FAERS. By mapping prescription opioids to their RxNorm identifiers, the study demonstrated a significant reduction in name diversity, improving users’ ability to access information from the database accurately. With over 2,000 unique opioid names identified, RxNorm proved efficient in creating a uniform dataset. This method can enhance data quality in pharmacovigilance, offering a reliable foundation for diverse research applications.
The perspective by Patel et al. introduces the “No-Boundary Thinking” session at the MidSouth Computational Biology and Bioinformatics Society’s (MCBIOS) 19th annual meeting. No-boundary thinking fosters innovation by encouraging the scientific community to transcend traditional limitations and norms. This mindset allows for the discovery of new opportunities and the creation of groundbreaking solutions. The session highlighted this concept, particularly in the context of AI in bioinformatics. During the session, participants explored the future of AI in bioinformatics over the next 30 years. They discussed the integration of tools like ChatGPT to enhance bioinformatics research, facilitating communication among scientists from various disciplines to maximize the potential of AI algorithms. Additionally, the session emphasized the importance of educational outreach to inspire the next generation of data scientists and informaticians. By embracing no-boundary thinking, the bioinformatics field can continue to evolve, driving forward with innovative and interdisciplinary approaches.
Type IV secretion systems (T4SSs) play a crucial role in the conjugation process of enteric bacteria, facilitating the transfer of plasmids that often contain antimicrobial resistance (AMR) genes. Algarni et al. developed a comprehensive plasmid transfer gene dataset, part of the FDA’s Virulence and Plasmid Transfer Factor Database, to analyze and compare conjugation-associated genes. By extracting relevant genes from GenBank, the study created tools to assess sequence diversity and compare plasmid transfer genes across different plasmid types. The plasmid transfer factor profile assessment and plasmid transfer factor comparison tools were instrumental in evaluating plasmids from GenBank and whole genome sequencing data. The findings demonstrated that these tools significantly enhance our understanding of how T4SSs and conjugative plasmids contribute to AMR gene dissemination, providing valuable insights for combating antimicrobial resistance.
Recent advances in deep learning have significantly improved contact map-based protein 3D structure prediction. Despite this, accessible software tools for beginners remain scarce. Baker et al. introduced GoFold, a user-friendly graphical user interface designed to simplify the contact map overlap (CMO) problem for novice users, aiding in better template selection. GoFold distinguishes itself with its intuitive design and thorough tutorials, making it accessible to those without extensive prior knowledge. It allows users to input proteins in various formats and visualize CMOs to aid understanding of which overlaps are problematic. The authors compared GoFold’s capabilities to those of the state-of-the-art method, map_align, using PSICOV and CAMEO datasets, and showed GoFold’s superior performance for prediction of the correct protein fold and for alignment of target protein to template. Running efficiently on personal computers without third-party dependencies, GoFold is freely available for macOS, Linux, and Windows, promoting broad accessibility.
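For readers unfamiliar with contact map overlap, the quantity being scored can be sketched as follows. This is a generic illustration of the CMO score for one given alignment, not GoFold's or map_align's implementation; the function name `contact_overlap` and the toy contact maps are assumptions made for the example. Finding the alignment that maximizes this score is the hard part of the CMO problem and is not attempted here.

```python
def contact_overlap(contacts_a, contacts_b, alignment):
    """Count contacts of protein A that map onto contacts of protein B.

    contacts_a, contacts_b: sets of (i, j) residue-index pairs, each contact listed once.
    alignment: dict mapping residue indices of A to residue indices of B.
    """
    # Store B's contacts with sorted indices so (i, j) and (j, i) compare equal.
    normalized_b = {tuple(sorted(pair)) for pair in contacts_b}
    shared = 0
    for i, j in contacts_a:
        # A contact can only be shared if both of its residues are aligned.
        if i in alignment and j in alignment:
            mapped = tuple(sorted((alignment[i], alignment[j])))
            if mapped in normalized_b:
                shared += 1
    return shared


# Toy contact maps (hypothetical data).
contacts_a = {(0, 2), (1, 3), (2, 4)}
contacts_b = {(0, 2), (1, 3), (3, 5)}

identity = {k: k for k in range(5)}
score_identity = contact_overlap(contacts_a, contacts_b, identity)   # (0,2) and (1,3) are shared

shifted = {2: 3, 4: 5}
score_shifted = contact_overlap(contacts_a, contacts_b, shifted)     # only (2,4) -> (3,5) is shared
```

Comparing the score across candidate alignments, as in the two calls above, shows why the choice of alignment determines which template looks most similar to the target.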
The papers included in this Research Topic provide examples of big data and artificial intelligence for genomics and therapeutics. They demonstrate the excellent studies from MCBIOS members in applying machine learning methods to extract valuable insights from big data.
Author contributions
HH: Writing–original draft, Writing–review and editing. IT-O: Writing–original draft, Writing–review and editing. RD: Writing–original draft, Writing–review and editing. ZQ: Writing–original draft, Writing–review and editing.
The author(s) declare that no financial support was received for the research, authorship, and/or publication of this article.
Conflict of interest
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
The author(s) declared that they were an editorial board member of Frontiers, at the time of submission. This had no impact on the peer review process and the final decision.
Publisher’s note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.
Author disclaimer
This editorial reflects the views of the authors and does not necessarily reflect those of the U.S. Food and Drug Administration.
Keywords: big data, bioinformatics, artificial intelligence, genomics, therapeutics
Citation: Hong H, Toby-Ogundeji I, Doerksen RJ and Qin ZS (2024) Editorial: Big data and artificial intelligence for genomics and therapeutics – Proceedings of the 19th Annual Meeting of the MidSouth Computational Biology and Bioinformatics Society (MCBIOS). Front. Bioinform. 4:1470107. doi: 10.3389/fbinf.2024.1470107
Received: 25 July 2024; Accepted: 29 July 2024; Published: 09 August 2024.
Edited and reviewed by:
Copyright © 2024 Hong, Toby-Ogundeji, Doerksen and Qin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
*Correspondence: Huixiao Hong, [email protected]
August 9, 2024
Unlocking the potential of rapeseed: CRISPR edits for hybrid efficiency
by TranSpread
Hybrid production in rapeseed faces several significant challenges, primarily due to the complexities and limitations of current male sterility systems. Traditional methods often involve intricate management processes and are highly sensitive to environmental conditions, resulting in unstable and inefficient hybrid seed production.
Due to these issues, there is a pressing need for a more efficient, stable, and environmentally resilient system to improve hybrid production in rapeseed, ensuring higher yields and better adaptability to varying agricultural conditions.
Researchers from Zhejiang University and Jiaxing Academy of Agricultural Sciences developed a novel approach using CRISPR/Cas9 technology. This method targets the BnDAD1 gene, creating male-sterile lines in rapeseed, thus simplifying hybrid seed production. The findings are published in the journal Horticulture Research.
The study effectively disrupted the BnDAD1 gene, which plays a crucial role in the jasmonic acid biosynthesis pathway, using CRISPR/Cas9 technology. This disruption resulted in male sterility due to defects in anther dehiscence and pollen maturation in rapeseed.
By applying exogenous methyl jasmonate, the researchers were able to restore fertility in the male-sterile lines, enabling the production of F1 hybrid seeds. This new two-line system offers a more straightforward and efficient method for hybrid seed production compared to traditional systems, which often face environmental stability issues.
The male sterility induced by the CRISPR/Cas9 method proved to be stable and complete, independent of environmental conditions, making it a robust solution for hybrid rapeseed production. This innovative approach holds significant commercial potential, promising to enhance the efficiency and sustainability of rapeseed cultivation.
Dr. Lixi Jiang, lead researcher from Zhejiang University, stated, "Our findings present a significant advancement in rapeseed hybrid production. The use of CRISPR/Cas9 to induce male sterility simplifies the breeding process and holds great promise for enhancing rapeseed yield and sustainability."
This innovative approach can revolutionize hybrid seed production in rapeseed, providing a more efficient and stable method. The application of this technology can lead to increased yields and sustainability in rapeseed cultivation, addressing the growing global demand for vegetable oil.
Journal information: Horticulture Research
Provided by TranSpread
- Published: 08 July 2021
Computation and biology: a partnership
Nature Methods volume 18, page 695 (2021)
- Biological techniques
- Computational biology and bioinformatics
Computation goes hand in hand with contemporary biological studies. We describe a few trends in computational science that are helping drive new biological knowledge.
From instrument control, to data analysis and visualization, to simulation and prediction studies, to computational notebooks used for record keeping, computation is an essential part of the majority of contemporary biological studies. Since our very first issue in October 2004, Nature Methods has been publishing computational methods and tools, as well as software performance comparisons, that we think will be of broad interest to life scientists.
New experimental technologies create opportunities for computational scientists to develop tools to exploit the data generated by such techniques. For example, the diversity of nucleic acid sequencing technologies now available has necessitated new computational tools and/or adaptations of existing tools for analyzing the resulting data, as reviewed for single-cell RNA-seq analysis in this issue. The importance of software tools to analyze data from techniques as diverse as mass spectrometry-based metabolomics and magnetic resonance imaging cannot be overstated. Computational advances can fundamentally change and improve how biologists interact with their data, and even propel new biological insights. New algorithms being applied to cryo-electron microscopy datasets, for instance, are allowing researchers to reconstruct heterogeneous structural ensembles of protein complexes. Computational methods that help researchers integrate disparate types of datasets can yield new biological inferences, making such datasets even more valuable than the sum of their parts.
Machine learning and its buzzworthy relative deep learning are here to stay, already having had a profound impact across multiple fields, especially in image analysis (as exemplified by our recent Focus issue on deep learning in microscopy), in neuroscience, and in genomics. Models with more sophisticated architecture and improved expressivity and interpretability are being developed at a rapid clip. Their applications are being explored to tackle some of the most daunting challenges in modern data sciences, such as high dimensionality, noise and sparsity.
Biology can often be computation-intensive, especially when crunching huge datasets or running detailed simulations. Supercomputers are all too rare (and expensive), however, so computational scientists have taken advantage of workarounds. Distributed computing has facilitated the intensive process of protein structure prediction, as exemplified by Rosetta@home. Many algorithms have been implemented to run on graphics processing units (GPUs) to take advantage of parallel computing, allowing performance gains of several orders of magnitude. This has helped accelerate computationally demanding all-atom molecular dynamics simulations, for example, making millisecond and longer time scale simulations a reality. The rise of cloud computing, exemplified by popular platforms such as Galaxy, allows a researcher to choose from a plethora of tools using infrastructure maintained by a service provider.
Another newsworthy computational trend, quantum computing, is also poised to make an impact in biology, as discussed in a Comment in this issue. Quantum computing may yet help with difficult search problems that are so computationally intensive that they are essentially impossible with classical computers. One area with potential to greatly benefit from this technology is molecular design; another is the analysis of population-scale datasets. However, applying quantum computing in life science is not simply a matter of porting an existing algorithm to a quantum computer; it is a fundamentally different computational paradigm. And not all biological problems will benefit from quantum computing. These caveats and more are discussed in this month’s Technology Feature.
Over the years we have continually improved how we handle computational papers. Since the early days of the journal, we have asked reviewers to evaluate tool performance and code, and required that code central to new methods we publish be made available upon publication. We have also aimed to educate our authors and readers about the importance of naming software and ensuring that it is properly cited. With popular code repositories such as GitHub and DOI-minting repositories such as Zenodo now in common use, software tools may be made readily accessible and discoverable. We have also partnered with Code Ocean to facilitate the peer review of code, without reviewers (and, eventually, readers) needing to download a frustrating number of dependencies to run a program.
It’s clear that computational tools for biology are no longer solely the domain of experts: another growing trend has been that of packaging tools in containerized platforms with easy-to-use graphical user interfaces. This has enabled life scientists without serious computational know-how to apply sophisticated software tools in their research. But this ‘black boxing’ of computational tools comes with a risk: life scientists must ensure they are sufficiently knowledgeable to understand how the tools they apply function, lest they apply the tools improperly or without a full understanding of their caveats.
On the flip side, software developers must consider what biologists need to know about how a tool functions without overwhelming them with details. We believe that computational methods papers intended to be read and used by biologists should in fact be readable by biologists. This is why such papers we publish tend to have a relatively brief description of the underlying algorithm in the main part of the text, supported by figures that demonstrate strong validation and an application to a challenging biological problem. Computationally savvy readers who are interested in looking under the hood at the algorithmic details will still find them accessible in the Methods section and in the Supplementary Information.
Over the years, we have been pleased to see a culture shift towards greater openness, with many researchers now habitually making software tools for biological research freely available and providing source code and detailed documentation. Going beyond improving the reproducibility and transparency of results generated using computational tools, such practices are more likely to facilitate greater community uptake. Making code open source and providing appropriate licenses allows other developers to adapt and further build on existing code, advancing science. As always, we welcome your feedback about how we can improve our editorial standards and processes to better serve both computational tool developers and tool users.
Computation and biology: a partnership. Nat Methods 18, 695 (2021). https://doi.org/10.1038/s41592-021-01215-2
Issue Date: July 2021
This article is cited by
Computational biology-based study of the molecular mechanism of spermidine amelioration of acute pancreatitis.
- Hongtao Duan
Molecular Diversity (2023)
Biological Sciences *: Journals
Finding Peer Reviewed Journals
Here are a few specific journals related to biological sciences. You can search for more using the USC Libraries Journal Search.
Biology Journals
- American Journal of Human Biology The American Journal of Human Biology is a peer-reviewed scientific journal covering human biology. It is the official publication of the Human Biology Association. The journal publishes original research, theoretical articles, reviews, and other communications connected to all aspects of human biology, health and disease.
- Annals of Human Biology Annals of Human Biology is a bimonthly academic journal that publishes review articles on human population biology, nature, development and causes of human variation. It is published by Taylor & Francis on behalf of the Society for the Study of Human Biology, of which it is the official journal.
- Biotechnology The Biotechnology Journal welcomes submissions from all areas of biotechnology and bioengineering research, including cell, tissue and organoid culture, disease models and therapeutics, synthetic biology and nanobiotechnology, metabolic engineering, bioenergy and bioprocesses, industrial processes, and plant and medical biotechnology.
- Journal of Biomedicine and Biotechnology The journal covers a broad range of topics in biotechnology and biomedicine, including 3D bioprinting, Agricultural biotechnology, Agro- and Food Biotechnology, Animal biotechnology, Antibody engineering, Applied Biotechnology, Applied microbiology, Assistive technologies, Biocatalysis, Biochemical Engineering/Bioprocess Engineering, Bioconversion, Biodegradation, Bioeconomics, Bioengineering, Biological systems engineering, Biomaterial implants, Biomaterials, Biomedical engineering, Biomimetics, Bionanotechnology, Bionics, Biopolymers, Bioremediation, Biotechnology applications, cardiology and cardiovascular diseases, Cardiovascular biomaterials, Chloroplast genome, CRISPR, Directed evolution, Environmental biotechnology, Environmental health, toxicology, Genetic engineering, etc.
- Molecular and Cellular Biochemistry Molecular and Cellular Biochemistry is a peer-reviewed scientific journal covering research in cellular biology and biochemistry.
Quantitative & Computational Biology Journals
- Cold Spring Harbor Symposia on Quantitative Biology Since 1933, major discoveries in biology--such as the structure of DNA, the genetic code, the polymerase chain reaction (PCR), and RNA interference (RNAi)--have been presented and debated at the Symposium held every summer at Cold Spring Harbor Laboratory in New York. Each Symposium focuses on a different and timely area of biological research and is attended by the leading figures in the field. The speakers are handpicked luminaries and rising stars who also publish a detailed discussion of the work they present in the annual Cold Spring Harbor Symposia on Quantitative Biology volume.
- PLoS Computational Biology PLOS Computational Biology is a monthly peer-reviewed open access scientific journal covering computational biology. It was established in 2005 by the Public Library of Science in association with the International Society for Computational Biology in the same format as the previously established PLOS Biology and PLOS Medicine.
- Journal of Computational Biology The Journal of Computational Biology is a monthly peer-reviewed scientific journal covering computational biology and bioinformatics.
- IEEE/ACM Transactions on Computational Biology and Bioinformatics IEEE/ACM Transactions on Computational Biology and Bioinformatics is a bimonthly peer-reviewed scientific journal. It is a joint publication of the IEEE Computer Society, Association for Computing Machinery, IEEE Computational Intelligence Society, and the IEEE Engineering in Medicine and Biology Society.
- Journal of Computational Biology and Bioinformatics Research The Journal of Computational Biology and Bioinformatics Research (JCBBR) is a peer reviewed journal. The journal is published quarterly and covers all areas of the subject such as genome annotation, comparative genomics, analysis of gene expression and structural bioinformatic approaches.
Marine Biology Journals
- Marine Biotechnology Marine Biotechnology welcomes high-quality research papers presenting novel data on the biotechnological applications of aquatic organisms. The journal publishes papers in the areas of molecular biology, genomics, proteomics, cell biology, and biochemistry, and particularly encourages submissions of papers related to genome biology such as linkage mapping, large-scale gene discoveries, QTL analysis, physical mapping, and comparative and functional genome analysis.
- Nature Biotechnology Nature Biotechnology is a monthly journal covering the science and business of biotechnology. It publishes new concepts in technology/methodology of relevance to the biological, biomedical, agricultural and environmental sciences as well as covers the commercial, political, ethical, legal, and societal aspects of this research. The first function is fulfilled by the peer-reviewed research section, the second by the expository efforts in the front of the journal.
Neuroscience Journals
- Clinical EEG and Neuroscience Clinical EEG and Neuroscience conveys clinically relevant research and development in electroencephalography and neuroscience.
- Journal of Cognitive Neuroscience The Journal of Cognitive Neuroscience investigates brain-behavior interactions and promotes a lively interchange among the mind sciences. Contributions address both descriptions of function and underlying brain events and reflect the interdisciplinary nature of the field, covering developments in neuroscience, neuropsychology, and cognitive psychology.
- Journal of Neuroscience The Journal of Neuroscience is a weekly peer-reviewed scientific journal published by the Society for Neuroscience. It covers empirical research on all aspects of neuroscience.
- Neuroscience Neuroscience is an international journal under the editorial direction of IBRO. Neuroscience publishes papers describing the results of original research on any aspect of the scientific study of the nervous system.
- Neuroscience and Biobehavioral Reviews Neuroscience & Biobehavioral Reviews is a peer-reviewed scientific journal covering behavioral neuroscience published by Elsevier. The journal publishes reviews, theoretical articles, and mini-reviews. It is an official journal of the International Behavioral Neuroscience Society.
- Last Updated: Aug 8, 2024 2:15 PM
- URL: https://libguides.usc.edu/biology
Article Versions Notes
Action | Date | Notes | Link |
---|---|---|---|
article xml file uploaded | 11 August 2024 12:52 CEST | Original file | - |
article xml uploaded. | 11 August 2024 12:52 CEST | Update | |
article pdf uploaded. | 11 August 2024 12:52 CEST | Version of Record | |
article html file updated | 11 August 2024 12:54 CEST | Original file | |
Zhang, Y.; Huang, Z.; Sun, L. Fragmentation Characteristics of Bubbles in a Throttling Hole Pipe. Micromachines 2024, 15, 1025. https://doi.org/10.3390/mi15081025
Zhang Y, Huang Z, Sun L. Fragmentation Characteristics of Bubbles in a Throttling Hole Pipe. Micromachines. 2024; 15(8):1025. https://doi.org/10.3390/mi15081025
Zhang, Yufeng, Zhijie Huang, and Lixia Sun. 2024. "Fragmentation Characteristics of Bubbles in a Throttling Hole Pipe" Micromachines 15, no. 8: 1025. https://doi.org/10.3390/mi15081025
Research Article. Derivation and simulation of a computational model of active cell populations: How overlap avoidance, deformability, cell-cell junctions and cytoskeletal forces affect alignment ... This Collection aims to increase the coverage of computational biology-related topics in Wikipedia by rewarding authors with a citable, PubMed ...
Atom. RSS Feed. Computational biology and bioinformatics is an interdisciplinary field that develops and applies computational methods to analyse large collections of biological data, such as ...
Read the latest Research articles in Computational biology and bioinformatics from Scientific Reports
A new study in Nature Methods describes a computational method named UTAG (unsupervised discovery of tissue architecture with graphs) that aims to identify and quantify higher-level tissue domains ...
Here, sixteen computational biologists around the globe present "A field guide to cultivating computational biology," focusing on solutions. Biology in the digital era requires computation and collaboration. A modern research project may include multiple model systems, use multiple assay technologies, collect varying data types, and require ...
The research program in the Computational Biology Branch is carried out by Senior Investigators, tenure track Investigators, Staff Scientists, Postdoctoral Fellows, and students. ... chemical informatics, and genome analysis. Research interests further cover a wide range of topics in computational biology and information science. These include ...
Biological science produces "big data" in varied formats, which necessitates using computational tools to process, integrate, and analyse data. Researchers using computational biology tools ...
Starting in 1990, by 1999, chromosome 22 became the first human chromosome to be completely sequenced. Computational biology refers to the use of data analysis, mathematical modeling and computational simulations to understand biological systems and relationships. [1] An intersection of computer science, biology, and big data, the field also ...
Computational biology faculty members are experts in analyzing proteomic and metabolomic data to find molecules and pathways that are differentially expressed or differentially modified between experimental conditions or patient groups. Our faculty has expertise in using novel algorithms to analyze proteomic and metabolomic data to address a ...
Computational biology is rapidly evolving. This places a premium on intellectual flexibility. Be aware of new trends and emerging advances, but don't be too swayed by fashion: most importantly, maintain intellectual flexibility and stay grounded in fluency in a variety of mathematical, computational, and statistical techniques. Approach a new domain with a sense of adventure and fearlessness ...
Biomodelling or Systems Biology: Computational biomodelling, or systems biology, is a computer-based simulation of a biological system used to understand and predict interactions within that system. Computers can model systems at any level, from populations to cellular networks and the sub-cellular worlds of signal transduction pathways and ...
The Computational Biology Graduate Group facilitates student immersion into UC Berkeley's vibrant computational biology research community. Currently, the Group includes over 46 faculty from across 14 departments of the College of Letters and Science, the College of Engineering, the College of Natural Resources, and the School of Public Health.
Research topics Interpretability for machine learning and computational biology Understanding real-world datasets is often challenging due to their size, complexity and/or poor knowledge about the problem to be tackled (i.e. electronic health records, OMICS data, etc.).
Mathematical and computational concepts, methods, and algorithms are being applied to all areas of basic and clinical life sciences, which results in a variety of research topics. Examples include: Biophysics and Structural Biology - Protein structure and function prediction; analysis of biological sequences and 3D structures; macromolecular ...
gene expression and regulation •DNA, RNA, and protein sequence, structure, and interactions • molecular evolution • protein design • network and systems biology • cell and tissue form and function • disease gene mapping • machine learning • quantitative and analytical modeling.
Computational design of soluble and functional membrane protein analogues. A deep learning approach enables accurate computational design of soluble and functional analogues of membrane proteins ...
Biology in the digital era requires computation and collaboration. A modern research project may include multiple model systems, use multiple assay technologies, collect varying data types, and require complex computational strategies, which together make effective design and execution difficult or impossible for any individual scientist.
The MIT Initiative in Computational and Systems Biology (CSBi) is a campus-wide research and education program that links biology, engineering, and computer science in a multidisciplinary approach to the systematic analysis and modeling of complex biological phenomena. This course is one of a series of core subjects offered through the CSB Ph.D ...
About the Course. Great ideas in computational biology (02-251) is a 12-unit course offered to students at Carnegie Mellon University who are interested in an introduction to the field of computational biology. It is taken by students of all years of study, but it is aimed at School of Computer Science first-year students who are interested in ...
4. The initial Gaussian height (0.05 kcal/mol) can be tuned to decide the duration of the simulation needed to reach equilibrium; after that, the frequency of Gaussian addition (T_G, 0.09 ps) can be tuned as ...
Covers current research topics in computational molecular biology. Recent research papers are presented from leading conferences such as the SIGACT International Conference on Computational Molecular Biology (RECOMB). Topic areas include original research (both theoretical and experimental) in genomics, molecular sequence analysis, recognition of ...
This article is part of the Research Topic Big data and artificial intelligence for genomics and therapeutics ... The 19th annual MidSouth Computational Biology and Bioinformatics Society (MCBIOS) conference took place on the University of Dallas campus over the span of three days (March 15th-17th, 2023).
The application of IML methods has surged in prominence within computational biology research across a wide range of biological tasks [3,4,5,6,7]:
An essential task in computational genomics involves transforming input sequences into their constituent k-mers. The quest for an efficient representation of k-mer sets is crucial for enhancing the scalability of bioinformatic analyses. One widely used method involves converting the k-mer set into a de Bruijn graph (dBG), followed by seeking a compact graph representation via the smallest path ...
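As a rough illustration of the first two steps mentioned above (splitting sequences into k-mers, then arranging the k-mer set as a de Bruijn graph), here is a minimal Python sketch. The function names `extract_kmers` and `build_de_bruijn` are hypothetical, and the edge-centric convention used here (nodes are (k-1)-mers, each k-mer is an edge from its prefix to its suffix) is an assumption, since the text does not say which dBG variant is meant. The sketch does not attempt the compact representation or path-cover step.

```python
from collections import defaultdict


def extract_kmers(sequence, k):
    """Return the set of all length-k substrings (k-mers) of a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def build_de_bruijn(kmers):
    """Build an edge-centric de Bruijn graph from a k-mer set.

    Assumption: each k-mer contributes one edge from its (k-1)-length
    prefix to its (k-1)-length suffix. Returns an adjacency mapping
    from each prefix node to the set of its successor nodes.
    """
    graph = defaultdict(set)
    for kmer in kmers:
        graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


kmers = extract_kmers("ACGTAC", 3)
graph = build_de_bruijn(kmers)
for node in sorted(graph):
    print(node, "->", sorted(graph[node]))
```

Using a set for the k-mers discards multiplicity, which matches the idea of representing a k-mer set rather than a multiset; a counting structure would be needed if abundances matter.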
The findings are published in the journal Horticulture Research. The study effectively disrupted the BnDAD1 gene, which plays a crucial role in the jasmonic acid biosynthesis pathway, using CRISPR ...
Research on algorithmic bias has highlighted how the application of machine learning techniques to corpora generated by humans is likely to reproduce the biases present in the corpora. As large language models (LLMs) like ChatGPT have been recently opened to the broad public, with potential applications in journalism, copywriting, academia, and other writing tasks, and as they ...
Nature Methods 18, 695 (2021). Computation goes hand in hand with contemporary biological studies. We describe a few trends in computational science that are helping drive new ...
PLOS Computational Biology is a monthly peer-reviewed open access scientific journal covering computational biology. It was established in 2005 by the Public Library of Science in association with the International Society for Computational Biology in the same format as the previously established PLOS Biology and PLOS Medicine.
Feature papers represent the most advanced research with significant potential for high impact in the field. A Feature Paper should be a substantial original Article that involves several techniques or approaches, provides an outlook for future research directions and describes possible research applications.