
17+ Interesting Neural Network Project Ideas for All Levels


Did you know that neural networks are a key part of many amazing technologies, like voice assistants and self-driving cars? They are becoming increasingly popular as they open up new opportunities in artificial intelligence.

For students, neural networks provide a hands-on way to engage with technology, building problem-solving and critical-thinking abilities while introducing new ideas in machine learning.

We have created a variety of neural network project ideas for people who are just starting out or who want to improve their skills. These projects give you hands-on experience and help you feel more confident in solving real-world problems.

In this blog, we will explain each project idea step-by-step, providing guidance and insights to support your journey in mastering neural networks. Let’s dive in and unleash the potential of this exciting field together!

What Is a Neural Network?


A neural network is a computational model inspired by the structure and function of the human brain. It is made up of interconnected nodes, or neurons, that are grouped into layers.

Information is processed through these neurons, each applying a mathematical operation to its inputs and passing the result to the next layer. 

Through training, the network learns to recognize patterns and relationships in data, enabling tasks such as image recognition, language translation, and predictive analytics. 

Neural networks have become a cornerstone of artificial intelligence, powering advancements in various fields by mimicking the brain’s ability to learn and adapt.

Neural Network Project Ideas Suitable for All Levels – Beginners to Advanced

Here are some neural network project ideas suitable for all levels, from beginners to advanced:


Beginner-Level Neural Network Project Ideas 

1. Image Classification

Start with a simple project where you train a neural network to classify images like cats vs. dogs or different types of fruits. Using a dataset like CIFAR-10, you’ll learn how to preprocess images, design a basic neural network architecture with tools like TensorFlow or PyTorch, and evaluate its accuracy.

2. Handwritten Digit Recognition

Dive into the fundamentals of neural networks by building a model that can recognize handwritten digits (0-9). Utilize the MNIST dataset, a classic benchmark for this task. 

You’ll gain insights into data preprocessing, feature extraction, and implementing a basic feedforward neural network to achieve high accuracy in digit recognition.
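To make this concrete, here is a minimal sketch of the feedforward approach described above, using Keras. The layer sizes, epoch count, and validation split are illustrative assumptions, not prescriptions from this article:

```python
# A minimal MNIST digit classifier (sketch; hyperparameters are illustrative).
import tensorflow as tf

# Load and normalize the 28x28 grayscale images to the [0, 1] range
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

# A small feedforward network: flatten pixels, one hidden layer, softmax over 10 digits
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

model.fit(x_train, y_train, epochs=5, validation_split=0.1)
print(model.evaluate(x_test, y_test))  # [test loss, test accuracy]
```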

3. Sentiment Analysis

Explore natural language processing (NLP) by creating a sentiment analysis model. Train a neural network to classify movie reviews or tweets as positive, negative, or neutral. 

This project introduces you to text preprocessing techniques, word embedding, and recurrent neural networks (RNNs) or convolutional neural networks (CNNs) for sequential data analysis.
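As one possible starting point, the sketch below trains a small embedding-plus-LSTM model on the IMDB movie review dataset with Keras. The vocabulary size, sequence length, and layer sizes are arbitrary assumptions chosen for illustration:

```python
# A minimal binary sentiment classifier on IMDB reviews (sketch; sizes are illustrative).
import tensorflow as tf

vocab_size, max_len = 10000, 200
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.imdb.load_data(num_words=vocab_size)

# Pad/truncate each review (a list of word indices) to a fixed length
x_train = tf.keras.preprocessing.sequence.pad_sequences(x_train, maxlen=max_len)
x_test = tf.keras.preprocessing.sequence.pad_sequences(x_test, maxlen=max_len)

# Word embeddings feed a recurrent layer; a sigmoid output gives P(positive)
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, 32),
    tf.keras.layers.LSTM(32),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=3, validation_split=0.2)
print(model.evaluate(x_test, y_test))  # [test loss, test accuracy]
```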

4. Predicting House Prices

Apply regression techniques with a neural network to predict house prices based on features like square footage, number of bedrooms, and location. You’ll work with a housing dataset, preprocess the data, and design a regression model using tools like Keras or scikit-learn. This project enhances your understanding of regression analysis and model evaluation.
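For instance, a minimal regression sketch with scikit-learn might look like the following. The California housing dataset and the hidden-layer sizes stand in as assumptions; any tabular housing dataset works the same way:

```python
# A minimal house-price regression sketch (dataset and sizes are illustrative).
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_absolute_error

X, y = fetch_california_housing(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Feature scaling matters for neural networks on tabular data
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0)
model.fit(X_train, y_train)
print("MAE:", mean_absolute_error(y_test, model.predict(X_test)))
```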

5. Spam Email Detection

Build a spam email classifier using a neural network to distinguish between legitimate emails and spam. Use a labeled email dataset and preprocess the text data to extract relevant features. Implement a neural network architecture such as a multilayer perceptron (MLP) or recurrent neural network (RNN) to classify emails as spam or legitimate.

6. Music Genre Classification

Explore audio data analysis by developing a neural network model to classify music into different genres, such as rock, jazz, or pop. Utilize audio feature extraction techniques and a labeled music dataset. 

Design a neural network architecture, possibly using recurrent neural networks (RNNs) or convolutional neural networks (CNNs), to learn patterns and characteristics unique to each genre, enabling accurate classification.

Intermediate-Level Neural Network Project Ideas

7. Object Detection in Images

Move beyond image classification and tackle the task of object detection. Develop a neural network model capable of identifying objects within an image, locating them, and drawing bounding boxes around them.

Utilize datasets like COCO or PASCAL VOC and implement advanced architectures like YOLO (You Only Look Once) or Faster R-CNN for accurate detection.

8. Language Translation

Take on the challenge of building a neural machine translation system capable of translating text from one language to another. Implement a sequence-to-sequence model using recurrent neural networks (RNNs) or transformers. 

Train the model on parallel corpora like the Multi30k or WMT datasets to learn the mappings between different languages.

9. Facial Expression Recognition

Develop a neural network model to recognize facial expressions such as happiness, sadness, or anger from images or video frames. Utilize datasets like FER2013 or CK+ and design a convolutional neural network (CNN) architecture to capture spatial features from facial images. This project enhances your understanding of computer vision and emotion recognition.

10. Stock Price Prediction

Delve into financial forecasting by building a neural network model to predict stock prices. Use historical stock price data and relevant financial indicators as features. 

Design a time-series forecasting model, such as a recurrent neural network (RNN) or long short-term memory (LSTM) network, to capture temporal dependencies and make accurate predictions.

11. Music Generation

Explore the creative side of neural networks by developing a model capable of generating new music compositions. Train a recurrent neural network (RNN) or transformer model on a dataset of MIDI files representing musical sequences. The model learns the patterns and structures of music and generates novel compositions based on the learned patterns.

12. Video Action Recognition

Extend your knowledge of computer vision to video data by creating a neural network model for action recognition. 

Utilize datasets like UCF101 or Kinetics and design a spatiotemporal neural network architecture, such as 3D convolutional neural networks (3D CNNs) or Temporal Convolutional Networks (TCNs), to capture spatial and temporal features from video sequences. This project enables you to recognize and classify actions or activities within videos.

Advanced-Level Neural Network Project Ideas

13. Autonomous Vehicle Navigation

Embark on a sophisticated project to develop a neural network-based system for autonomous vehicle navigation. Integrate multiple sensors such as cameras, LiDAR, and radar to perceive the environment. 

Design a deep learning model capable of making real-time steering, acceleration, and braking decisions, enabling the vehicle to navigate safely in diverse traffic scenarios.

14. Medical Image Segmentation

Tackle the challenging task of medical image analysis by creating a neural network model for image segmentation. Focus on segmenting organs, tumors, or abnormalities in medical images like MRI or CT scans. 

Implement advanced segmentation architectures like U-Net or DeepLabv3+ to accurately delineate structures of interest for diagnostic and treatment planning purposes.

15. Generative Adversarial Networks (GANs) for Image Synthesis

Dive into generative modeling by building a GAN-based system for image synthesis. Train a generator and a discriminator network to compete against each other, generating realistic images from random noise. 

Experiment with architectures like DCGAN, StyleGAN, or CycleGAN to produce high-quality, diverse images with various art, design, and data augmentation applications.

16. Natural Language Understanding with Transformers

Explore state-of-the-art natural language processing (NLP) techniques by implementing transformer-based models like BERT or GPT. 

Develop a model capable of understanding and generating human-like text responses for tasks such as language translation, question answering, or dialogue generation. Fine-tune pre-trained transformer models on domain-specific data to achieve superior performance in NLP tasks.

17. Reinforcement Learning for Game Playing

Venture into the exciting field of reinforcement learning by creating an AI agent capable of mastering complex games. 

Implement deep reinforcement learning algorithms like Deep Q-Networks (DQN), Proximal Policy Optimization (PPO), or AlphaZero to train agents that achieve superhuman performance in games like chess, Go, or video games. Explore techniques for exploration, exploitation, and policy optimization to develop robust and adaptive game-playing agents.

18. Time Series Forecasting with Attention Mechanisms

Address the challenge of time series forecasting by leveraging attention mechanisms in neural networks. Develop models that dynamically focus on relevant temporal features while making predictions, allowing for more accurate and interpretable forecasts. 

Experiment with attention-based architectures like Transformer-based models or Temporal Attention Networks (TANs) to capture long-range dependencies and patterns in time series data for applications in finance, energy forecasting, and more.

These project ideas cover a range of applications and difficulty levels, allowing beginners to get started with foundational concepts while providing challenges for more experienced practitioners to explore advanced techniques and architectures.

Benefits of Using Neural Network Project Ideas

Using neural network project ideas offers several benefits:

Hands-on Learning

Neural network projects offer practical, hands-on experience, allowing learners to apply theoretical concepts in a real-world context.

Problem-Solving Skills

Engaging in projects fosters problem-solving abilities as learners encounter challenges and work through them to achieve desired outcomes.

Creativity and Innovation 

Developing neural network projects encourages creativity and innovation as learners explore novel ideas and solutions to tackle complex problems.

Deepened Understanding

Implementing projects deepens understanding of neural network concepts by providing opportunities to experiment with different architectures, algorithms, and techniques.

Portfolio Building

Completing projects allows learners to build a portfolio showcasing their skills and accomplishments, enhancing their credibility and employability in the field.

Community Engagement

Neural network projects often involve collaboration and knowledge-sharing within communities, providing opportunities for networking and peer support.

Ethical Considerations in Neural Network Projects

Ethical considerations are paramount in any field, and neural network projects are no exception. Here are some key ethical considerations to keep in mind:

  • Bias and Fairness: Neural network projects must address the potential for biases in data or algorithms that could lead to unfair outcomes or discrimination against certain groups.
  • Privacy Protection: Projects involving sensitive data must prioritize privacy protection measures to ensure the confidentiality and security of individuals’ information.
  • Transparency and Accountability: It’s crucial to maintain transparency in neural network projects, providing clear explanations of how decisions are made and being accountable for the consequences of those decisions.
  • Informed Consent: Obtaining informed consent from individuals whose data is used in projects is essential to ensure ethical practices and respect for autonomy.
  • Social Impact Assessment: Consider the broader social implications of neural network projects, including their potential effects on society, culture, and human rights.
  • Continuous Monitoring and Evaluation: Projects should be regularly monitored and evaluated for ethical compliance, with mechanisms in place to address and rectify any ethical concerns that arise during development or deployment.

Wrapping Up

Exploring neural network project ideas offers an exciting journey filled with learning, creativity, and innovation.

From beginner-level projects focusing on fundamental concepts to advanced endeavors pushing the boundaries of technology, each project presents unique opportunities for growth and discovery. 

By engaging in these projects, individuals deepen their understanding of neural networks and develop valuable problem-solving skills, critical thinking, and collaboration.

 Moreover, ethical considerations are crucial, reminding us to approach these projects with responsibility, integrity, and a commitment to positive societal impact. 

Ultimately, neural network project ideas empower us to harness artificial intelligence’s power to better our world.

Frequently Asked Questions (FAQs)

1. What are the real-life applications of neural networks?

Neural networks find applications in diverse fields such as image and speech recognition, natural language processing, medical diagnosis, autonomous vehicles, financial forecasting, and recommendation systems, revolutionizing industries with their ability to learn and adapt.

2. How can I get started with neural network projects as a beginner?

To get started with neural network projects, begin by learning Python programming basics and familiarizing yourself with libraries like TensorFlow or PyTorch. Explore beginner-friendly tutorials and online courses to build foundational knowledge.

3. What are some resources for learning about neural networks and project implementation?

Some resources for learning about neural networks and project implementation include online courses like Coursera’s “Deep Learning Specialization,” books like “Neural Networks and Deep Learning” by Michael Nielsen, and interactive platforms like Kaggle and GitHub.

4. What are some potential career opportunities for individuals skilled in neural network project development?

Individuals skilled in neural network project development can pursue careers as machine learning engineers, data scientists, research scientists, AI engineers, software developers, or consultants in the technology, healthcare, finance, and automotive industries.

5. Can neural networks be used for good in real life?

Absolutely! From medical diagnosis and weather forecasting to self-driving cars and language translation, neural networks are already impacting various fields positively. By exploring projects, you contribute to this potential while honing your skills for a future driven by AI.


Open access | Published: 16 January 2024

A review of graph neural networks: concepts, architectures, techniques, challenges, datasets, applications, and future directions

Bharti Khemani, Shruti Patil, Ketan Kotecha & Sudeep Tanwar

Journal of Big Data, volume 11, article 18 (2024)


Deep learning has seen significant growth recently and is now applied to a wide range of conventional use cases, including graphs. Graph data provides relational information between elements and is a standard data format for various machine learning and deep learning tasks. Models that can learn from such inputs are essential for working with graph data effectively. This paper identifies nodes and edges within specific applications, such as text, entities, and relations, to create graph structures. Different applications may require various graph neural network (GNN) models. GNNs facilitate the exchange of information between nodes in a graph, enabling them to understand dependencies within the nodes and edges. The paper delves into specific GNN models like graph convolution networks (GCNs), GraphSAGE, and graph attention networks (GATs), which are widely used in various applications today. It also discusses the message-passing mechanism employed by GNN models and examines the strengths and limitations of these models in different domains. Furthermore, the paper explores the diverse applications of GNNs, the datasets commonly used with them, and the Python libraries that support GNN models. It offers an extensive overview of the landscape of GNN research and its practical implementations.

Introduction

Graph Neural Networks (GNNs) have emerged as a transformative paradigm in machine learning and artificial intelligence. The ubiquitous presence of interconnected data in various domains, from social networks and biology to recommendation systems and cybersecurity, has fueled the rapid evolution of GNNs. These networks have displayed remarkable capabilities in modeling and understanding complex relationships, making them pivotal in solving real-world problems that traditional machine-learning models struggle to address. GNNs’ unique ability to capture intricate structural information inherent in graph-structured data is significant. This information often manifests as dependencies, connections, and contextual relationships essential for making informed predictions and decisions. Consequently, GNNs have been adopted and extended across various applications, redefining what is possible in machine learning.

In this comprehensive review, we embark on a journey through the multifaceted landscape of Graph Neural Networks, encompassing an array of critical aspects. Our study is motivated by the ever-increasing literature and diverse perspectives within the field. We aim to provide researchers, practitioners, and students with a holistic understanding of GNNs, serving as an invaluable resource to navigate the intricacies of this dynamic field. The scope of this review is extensive, covering fundamental concepts that underlie GNNs, various architectural designs, techniques for training and inference, prevalent challenges and limitations, the diversity of datasets utilized, and practical applications spanning a myriad of domains. Furthermore, we delve into the intriguing future directions that GNN research will likely explore, shedding light on the exciting possibilities.

In recent years, deep learning (DL) has been called the gold standard in machine learning (ML). It has also steadily evolved into the most widely used computational technique in ML, producing excellent results on various challenging cognitive tasks, sometimes even matching or outperforming human ability. One benefit of DL is its capacity to learn from enormous amounts of data [1]. GNN variations such as graph convolutional networks (GCNs), graph attention networks (GATs), and GraphSAGE have shown groundbreaking performance on various deep learning tasks in recent years [2].

A graph is a data structure that consists of nodes (also called vertices) and edges. Mathematically, it is defined as G = (V, E), where V denotes the nodes and E denotes the edges. Edges in a graph can be directed or undirected based on whether directional dependencies exist between nodes. A graph can represent various data structures, such as social networks, knowledge graphs, and protein–protein interaction networks. Graphs are non-Euclidean spaces, meaning that the distance between two nodes in a graph is not necessarily equal to the distance between their coordinates in a Euclidean space. This makes applying traditional neural networks to graph data difficult, as they are typically designed for Euclidean data.
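As a toy illustration of the G = (V, E) definition, the following Python snippet stores a small undirected graph as an adjacency matrix; the node names and edges are made up for the example:

```python
# A tiny undirected graph G = (V, E) stored as an adjacency matrix.
import numpy as np

V = ["A", "B", "C", "D"]                  # nodes (vertices)
E = [("A", "B"), ("A", "C"), ("C", "D")]  # undirected edges

idx = {v: i for i, v in enumerate(V)}
A = np.zeros((len(V), len(V)), dtype=int)
for u, v in E:
    A[idx[u], idx[v]] = 1
    A[idx[v], idx[u]] = 1                 # undirected: the matrix is symmetric

print(A)  # nonzero entries mark which nodes are connected
```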

Graph neural networks (GNNs) are a type of deep learning model that can be used to learn from graph data. GNNs use a message-passing mechanism to aggregate information from neighboring nodes, allowing them to capture the complex relationships in graphs. GNNs are effective for various tasks, including node classification, link prediction, and clustering.

Organization of paper

The paper is organized as follows:

The primary focus of this research is to comprehensively examine Concepts, Architectures, Techniques, Challenges, Datasets, Applications, and Future Directions within the realm of Graph Neural Networks.

The paper delves into the Evolution and Motivation behind the development of Graph Neural Networks, including an analysis of the growth of publication counts over the years.

It provides an in-depth exploration of the Message Passing Mechanism used in Graph Neural Networks.

The study presents a concise summary of GNN learning styles and GNN models, complemented by an extensive literature review.

The paper thoroughly analyzes the Advantages and Limitations of GNN models when applied to various domains.

It offers a comprehensive overview of GNN applications, the datasets commonly used with GNNs, and the array of Python libraries that support GNN models.

In addition, the research identifies and addresses specific research gaps, outlining potential future directions in the field.

" Introduction " section describes the Introduction to GNN. " Background study " section provides background details in terms of the Evolution of GNN. " Research motivation " section describes the research motivation behind GNN. Section IV describes the GNN message-passing mechanism and the detailed description of GNN with its Structure, Learning Styles, and Types of tasks. " GNN Models and Comparative Analysis of GNN Models " section describes the GNN models with their literature review details and comparative study of different GNN models. " Graph Neural Network Applications " section describes the application of GNN. And finally, future direction and conclusions are defined in " Future Directions of Graph Neural Network " and " Conclusions " sections, respectively. Figure  1 gives the overall structure of the paper.

Figure 1: The overall structure of the paper

Background study

As shown in Fig. 2 below, the evolution of GNNs started in 2005, and over the past five years research in this area has grown in depth and detail. Graph neural networks are now used by researchers in practically every field, including NLP, computer vision, and healthcare.

Figure 2: Year-wise publication count of GNN (2005–2022)

Graph neural network research evolution

Graph neural networks (GNNs) were first proposed in 2005 but have only recently begun to gain traction. GNNs were first introduced by Gori [2005] and Scarselli [2004, 2009]. A node is naturally defined by its attributes and the nodes connected to it in the graph. A GNN aims to learn a state embedding \({h}_{v} \in {\mathbb{R}}^{s}\) that encapsulates each node’s neighborhood information. This s-dimensional state embedding of node v can be used to generate an output \({o}_{v}\), such as the predicted distribution of the node’s label [30]. Thomas Kipf and Max Welling introduced the graph convolutional network (GCN) in 2017. A GCN layer defines a first-order approximation of a localized spectral filter on graphs. GCNs can be thought of as convolutional neural networks that have been expanded to handle graph-structured data.

Graph neural network evolution

As shown in Fig. 3 below, research on graph neural networks (GNNs) began in 2005 and is still ongoing. GNNs can handle a broad class of graphs and can be used for node-focused tasks, edge-focused tasks, graph-focused tasks, and many other applications. In 2005, Marco Gori introduced the concept of GNNs, defining them as an extension of recursive neural networks [4]. Franco Scarselli also explained the use of GNNs for ranking web pages in 2005 [5]. In 2006, Swapnil Gandhi and Anand Padmanabha Iyer of Microsoft Research introduced distributed deep graph learning at scale, which defines a deep graph neural network [6]. They explained new concepts such as GCN and GAT [1]. Pucci and Gori used GNN concepts in recommendation systems.

Figure 3: Graph neural network evolution

In 2007, Chun-Guang Li, Jun Guo, and Hong-Gang Zhang used a semi-supervised learning concept with GNNs [7]. They proposed a pruning method to enhance the basic GNN and resolve the problem of choosing the neighborhood scale parameter. In 2008, Ziwei Zhang introduced the new concept of Eigen-GNN [8], which works well with several GNN models. In 2009, Abhijeet V introduced the GNN concept in fuzzy networks [9], proposing a granular reflex fuzzy min–max neural network for classification. In 2010, DK Chaturvedi explained the concept of GNNs for soft computing techniques [10], and GNNs came to be widely used in many applications. Also in 2010, Tanzima Hashem discussed privacy-preserving group nearest neighbor queries [11]. The first initiative to use GNNs for knowledge graph embedding was R-GCN, which suggests a relation-specific transformation in the message-passing phases to deal with various relations.

Similarly, from 2011 to 2017, authors continued to survey new GNN concepts, and the number of surveys has increased steadily since 2018. Our paper shows that GNN models such as GCN, GAT, and R-GCN are helpful [12].

Literature review

Table 1 describes the literature survey on graph neural networks, including the application area, the dataset used, the model applied, and the performance evaluation. The literature covers the years 2018 to 2023.

Research motivation

We employ grid data structures for the normalization of image inputs, typically using an n×n filter, and the result is computed by applying an aggregation or maximum function. This process works effectively due to the inherent fixed structure of images: we position the grid over the image, move the filter across it, and derive the output vector, as depicted on the left side of Fig. 4. In contrast, this approach is unsuitable for graphs. Graphs lack a predefined structure for data storage, and there is no inherent knowledge of node-to-neighbor relationships, as illustrated on the right side of Fig. 4. To overcome this limitation, we focus on graph convolution.

Figure 4: CNN in Euclidean space (left); GNN in non-Euclidean space (right)

In the context of GCNs, convolutional operations are adapted to handle graphs’ irregular and non-grid-like structures. These operations typically involve aggregating information from neighboring nodes to update the features of a central node. CNNs are primarily used for grid-like data structures, such as images. They are well-suited for tasks where spatial relationships between neighboring elements are crucial, as in image processing. CNNs use convolutional layers to scan small local receptive fields and learn hierarchical representations. GNNs are designed for graph-structured data, where edges connect entities (nodes). Graphs can represent various relationships, such as social networks, citation networks, or molecular structures. GNNs perform operations that aggregate information from neighboring nodes to update the features of a central node. CNNs excel in processing grid-like data with spatial dependencies; GNNs are designed to handle graph-structured data with complex relationships and dependencies between entities.

Limitation of CNN over GNN

Graph Neural Networks (GNNs) draw inspiration from Convolutional Neural Networks (CNNs). Before delving into the intricacies of GNNs, it is essential to understand why Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) may not suffice for effectively handling data structured as graphs. As illustrated in Fig.  5 , Convolutional Neural Networks (CNNs) are designed for data that exhibits a grid structure, such as images. Conversely, Recurrent Neural Networks (RNNs) are tailored to sequences, like text.

Figure 5: Convolution can be performed with an n×n mask if the input is an image (left), but cannot be achieved with an n×n mask if the input is a graph (right)

Typically, we use arrays for storage when working with text data. Likewise, for image data, matrices are the preferred choice. However, as depicted in Fig.  5 , arrays and matrices fall short when dealing with graph data. In the case of graphs, we require a specialized technique known as Graph Convolution. This approach enables deep neural networks to handle graph-structured data directly, leading to a graph neural network.

Fig. 5 illustrates that we can employ masking techniques and apply filtering operations to transform the data into vector form when we have images. Conversely, traditional masking methods are not applicable when dealing with graph data as input, as shown in the right image.

Graph neural network

Graph Neural Networks, or GNNs, are a class of neural networks tailored for handling data organized in graph structures. Graphs are mathematical representations of nodes connected by edges, making them ideal for modeling relationships and dependencies in complex systems. GNNs have the inherent ability to learn and reason about graph-structured data, enabling diverse applications. In this section, we first explain the message-passing mechanism of GNNs ("Message passing mechanism in graph neural network" section), then describe graphs in terms of their structure, types, and learning styles ("Description of GNN taxonomy" section).

Message passing mechanism in graph neural network

A GNN is an optimizable transformation on all graph properties (nodes, edges, and global context) that preserves graph symmetries (permutation invariance). Because a GNN does not alter the connectivity of the input graph, the output may be described using the same adjacency list and feature vector count as the input graph. However, the output graph has updated embeddings because the GNN has modified each node, edge, and global-context representation.

In Fig. 6, circles are nodes, and empty boxes show the aggregation of neighbor/adjacent nodes. The model aggregates messages from A's local graph neighbors (i.e., B, C, and D). In turn, the messages coming from neighbors are based on information aggregated from their respective neighborhoods, and so on. This visualization shows a two-layer version of a message-passing model. Notice that the computation graph of the GNN forms a tree structure by unfolding the neighborhood around the target node [17]. Graph neural networks (GNNs) are neural models that capture the dependence of graphs via message passing between the nodes of graphs [30].

Figure 6: How a single node aggregates messages from its adjacent neighbor nodes

The message-passing mechanism of graph neural networks is shown in Fig. 7. We take an input graph with a set of node features \(X \in {\mathbb{R}}^{d \times |V|}\) and use this information to produce node embeddings \({z}_{u}\). However, we will also review how the GNN framework may embed subgraphs and whole graphs.

Figure 7: Message-passing mechanism in GNN

At each iteration, each node collects information from the neighborhood around it. As these iterations progress, each node embedding contains information from increasingly distant reaches of the graph. After the first iteration (k = 1), each node embedding expressly retains information from its 1-hop neighborhood, i.e., the nodes reachable by a path of length 1 [31]. After the second iteration (k = 2), each node embedding contains data from its 2-hop neighborhood; generally, after k iterations, each node embedding includes data from its k-hop neighborhood. The kind of “information” this message passing carries consists of two main parts: structural information about the graph (i.e., node degrees, etc.) and feature-based information.

In the message-passing mechanism of a graph neural network, each node stores its message in the form of a feature vector, and at each step the neighbors update this information [1]. The process aggregates information: for example, when the grey node is connected to the blue node, the features of both are aggregated into a new feature vector whose values are updated to include the new message.

Equations 4.1 and 4.2 show the message-passing update, where h denotes the embedding (message), u the node, and k the iteration number:

\({m}_{N(u)}^{(k)} = \mathrm{AGGREGATE}^{(k)}\left(\left\{ {h}_{v}^{(k-1)} : v \in N(u) \right\}\right)\)  (4.1)

\({h}_{u}^{(k)} = \mathrm{UPDATE}^{(k)}\left( {h}_{u}^{(k-1)}, {m}_{N(u)}^{(k)} \right)\)  (4.2)

AGGREGATE and UPDATE are arbitrary differentiable functions (i.e., neural networks), and \({m}_{N(u)}^{(k)}\) is the “message” aggregated from u’s graph neighborhood N(u). We employ superscripts to identify the embeddings and functions at various message-passing iterations. At each iteration k, the AGGREGATE function receives as input the set of embeddings of the nodes in u’s graph neighborhood N(u) and generates the message \({m}_{N(u)}^{(k)}\). The UPDATE function then combines this message with the previous embedding \({h}_{u}^{(k-1)}\) of node u to generate the updated embedding \({h}_{u}^{(k)}\).
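The following NumPy sketch illustrates one iteration of Eqs. 4.1 and 4.2. A neighbor mean stands in for AGGREGATE and a single dense layer for UPDATE; both are placeholder choices for the arbitrary differentiable functions described above, and the toy graph and dimensions are invented for the example:

```python
# One message-passing iteration (sketch; AGGREGATE = mean, UPDATE = dense layer).
import numpy as np

def message_passing_step(A, H, W_self, W_neigh):
    """A: adjacency matrix (n x n); H: node embeddings h^(k-1) (n x d)."""
    deg = A.sum(axis=1, keepdims=True).clip(min=1)
    m = (A @ H) / deg                         # Eq. 4.1: mean-aggregate over N(u)
    return np.tanh(H @ W_self + m @ W_neigh)  # Eq. 4.2: combine and update

rng = np.random.default_rng(0)
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]])                     # 3-node toy graph
H = rng.normal(size=(3, 4))                   # initial node features h^(0)
H = message_passing_step(A, H, rng.normal(size=(4, 4)), rng.normal(size=(4, 4)))
print(H.shape)  # (3, 4): updated embedding h^(1) for each node
```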

Description of GNN taxonomy

Figure 8 below shows that we have divided our GNN taxonomy into three parts [30]:

Figure 8: Graph neural network taxonomy

1. Graph structures

2. Graph types

3. Graph learning tasks

Graph structure

Two scenarios, shown in Fig. 9, typically present themselves: structural and non-structural. In structural contexts, such as applications involving molecular and physical systems and knowledge graphs, the graph structure is stated explicitly.

Figure 9: Graph structure

Graphs are implicit in non-structural situations, so we must first construct the graph from the current task: for text, we build a fully connected word graph, and for images, a scene graph.

Graph types

Complex graph types may carry additional information about nodes and edges. Graphs are typically divided into five categories, as shown in Fig. 10.

Figure 10: Types of graphs

Directed/undirected graphs

A directed graph is characterized by edges with a specific direction, indicating the flow from one node to another. Conversely, in an undirected graph, the edges lack a designated direction, allowing nodes to interact bidirectionally. As illustrated in Fig. 11 (left side), the directed graph exhibits directed edges, while in Fig. 11 (right side), the undirected graph conspicuously lacks directional edges. In undirected graphs, it's important to note that each edge can be considered to comprise two directed edges, allowing for mutual interaction between connected nodes.

Figure 11: Directed/undirected graph

Static/dynamic graphs

The term “dynamic graph” refers to a graph whose properties or structure change with time. In the dynamic graphs shown in Fig. 12, it is essential to account for the temporal dimension appropriately. These dynamic graphs represent time-dependent events, such as the addition and removal of nodes and edges, typically presented as an ordered sequence or an asynchronous stream.

A noteworthy example of a dynamic graph can be observed in social networks like Twitter. In such networks, a new node is created each time a new user joins, and when a user follows another individual, a following edge is established. Furthermore, when users update their profiles, the respective nodes are also modified, reflecting the evolving nature of the graph. It's worth noting that different deep-learning libraries handle graph dynamics differently. TensorFlow, for instance, employs a static graph, while PyTorch utilizes a dynamic graph.

Figure 12: Static/dynamic graph

Homogeneous/heterogeneous graphs

Homogeneous graphs have only one type of node and one type of edge, as shown in Fig. 13 (left). A homogeneous graph is one in which all nodes and edges are of the same type, such as an online social network whose nodes represent people and whose edges represent friendship.

Heterogeneous graphs, shown in Fig. 13 (right), however, have two or more different kinds of nodes and edges. An example of a heterogeneous network is an online social network with various types of edges between nodes of the ‘person’ type, such as ‘friendship’ and ‘co-worker.’ Nodes and edges in heterogeneous graphs come in several varieties, and their types play critical roles that require further consideration.

Figure 13: Homogeneous (left) and heterogeneous (right) graphs

Knowledge graphs

A Knowledge Graph (KG) is a network of entity nodes and relationship edges that can be represented as a set of triples of the form (h, r, t) or (s, r, o), where each triple encodes a relation r between a head entity h and a tail entity t. From this perspective, a knowledge graph can be considered a heterogeneous graph. The knowledge graph visually depicts several real-world objects and their relationships [32]. It can be used for many purposes, including information retrieval, knowledge-guided innovation, and question answering [30]. Entities are objects or things that exist in the real world, including individuals, organizations, places, music tracks, and movies. Each relation type similarly describes a particular relationship between entities. Figure 14 shows the knowledge graph for Mr. Sundar Pichai.

Figure 14: Knowledge graph

Transductive/inductive graphs

In the transductive scenario, shown in Fig. 15 (top), the entire graph is given as input, the labels of the validation data are hidden, and the model predicts those labels. In the inductive scenario, shown in Fig. 15 (bottom), we also input the entire graph (but only sample it in batches), mask the validation data’s labels, and forecast them. In the transductive setting, the model must predict the labels of the given unlabeled nodes; in the inductive setting, it can generalize to new unlabeled nodes from the same distribution.

Figure 15: Transductive/inductive graphs

Transductive Graph:

In the transductive approach, the entire graph is provided as input.

This method involves concealing the labels of the validation data.

The primary objective is to predict the labels for the validation data.

Inductive Graph:

The inductive approach still uses the complete graph, but only a sample within a batch is considered.

A crucial step in this process is masking the labels of the validation data.

The key aim here is to make predictions for the labels of the validation data.

Graph learning tasks

We perform three tasks with graphs: node classification, link prediction, and graph classification, as shown in Fig. 16.

Figure 16: Node-level prediction, e.g., social network (left); edge-level prediction, e.g., next YouTube video (middle); graph-level prediction, e.g., molecule (right)

Node-level task

Node-level tasks are primarily concerned with determining the identity or function of each node within a graph. The core objective of a node-level task is to predict specific properties associated with individual nodes. For example, a node-level task in social networks could involve predicting which social group a new member is likely to join based on their connections and the characteristics of their friends' memberships. Node-level tasks are typically used when working with unlabeled data, such as identifying whether a particular individual is a smoker.

Edge-level task (link prediction)

Edge-level tasks revolve around analyzing relationships between pairs of nodes in a graph. An illustrative application of an edge-level task is assessing the compatibility or likelihood of a connection between two entities, as seen in matchmaking or dating apps. Another instance of an edge-level task is evident on platforms like Netflix, where the task involves predicting the next video to recommend based on viewing history and user preferences.

Graph-level

In graph-level tasks, the objective is to make predictions about a characteristic or property that encompasses the entire graph. For example, using a graph-based representation, one might aim to predict attributes like the olfactory quality of a molecule or its potential to bind with a disease-associated receptor. The essence of a graph-level task is to provide predictions that pertain to the graph as a whole. For instance, when assessing a newly synthesized chemical compound, a graph-level task might seek to determine whether the molecule has the potential to be an effective drug. A summary of all three learning tasks is shown in Fig. 17.

Figure 17: Graph learning tasks summary

GNN models and comparative analysis of GNN models

Graph Neural Network (GNN) models represent a category of neural networks specially crafted to process data organized in graph structures. They’ve garnered substantial acclaim across various domains, primarily due to their exceptional capability to grasp intricate relationships and patterns within graph data. As illustrated in Fig. 18, we’ve outlined three distinct GNN models. A comprehensive description of these GNN models, specifically Graph Convolutional Networks (GCN), Graph Attention Networks (GAT/GAN), and GraphSAGE, can be found in reference [33]. In the "GNN models" section, we delve into these GNN models’ intricacies; in the "Comparative study of GNN models" section, we provide an in-depth analysis that explores their theoretical and practical aspects.

Figure 18: GNN models (GCN, GAT/GAN, and GraphSAGE)

Graph convolution neural network (GCN)

GCN is one of the most basic graph neural network variants; Thomas Kipf and Max Welling developed GCNs. Convolution in GCNs is essentially the same process as the convolution layers in convolutional neural networks: the input neurons are multiplied by weights called filters or kernels, which act as a sliding window across the image, allowing the CNN to learn features from nearby cells. Weight sharing means the same filter is used within the same layer throughout the image; when a CNN is used to identify photos of cats vs. non-cats, the same filter is employed in the same layer to detect the cat’s nose and ears, and the same weight (or kernel, or filter) is applied throughout the image [33]. GCNs were first introduced in “Spectral Networks and Deep Locally Connected Networks on Graphs” [34].

GCNs, which learn features by analyzing neighboring nodes, carry out similar behaviors. The primary difference between CNNs and GNNs is that CNNs are made to operate on regular (Euclidean) ordered data. GNNs, on the other hand, are a generalized version of CNNs with different numbers of node connections and unordered nodes (irregular on non-Euclidean structured data). GCNs have been applied to solve many problems, for example, image classification [ 35 ], traffic forecasting [ 36 ], recommendation systems [ 17 ], scene graph generation [ 37 ], and visual question answering [ 38 ].

GCNs are particularly well-suited for tasks that involve data represented as graphs, such as social networks, citation networks, recommendation systems, and more. These networks are an extension of traditional CNNs, widely used for tasks involving grid-like data, such as images. The key idea behind GCNs is to perform convolution operations on the graph data. This enables them to capture and propagate information through the nodes in a graph by considering both a node’s features and those of its neighboring nodes. GCNs typically consist of several layers, each performing convolution and aggregation steps to refine the node representations in the graph. By applying these layers iteratively, GCNs can capture complex patterns and dependencies within the graph data.

Working of graph convolutional network

A Graph Convolutional Network (GCN) is a type of neural network architecture designed for processing and analyzing graph-structured data. GCNs work by aggregating and propagating information through the nodes in a graph. GCN works with the following steps shown in Fig.  19 :

Figure 19: Working of GCN

Initialization:

Each node in the graph is associated with a feature vector. Depending on the application, these feature vectors can represent various attributes or characteristics of the nodes. For example, in a social network, each node might represent a user, and the features could include user profile information.

Convolution Operation:

The core of a GCN is the convolution operation, which is adapted from convolutional neural networks (CNNs). It aims to aggregate information from neighboring nodes. This is done by taking a weighted sum of the feature vectors of neighboring nodes. The graph's adjacency matrix determines the weights. The resulting aggregated information is a new feature vector for each node.

Weighted Aggregation:

The graph's adjacency matrix, typically after normalization, provides the weights for the aggregation process. For a given node, the features of its neighboring nodes are scaled by the corresponding values within the adjacency matrix, and the outcomes are then accumulated. A precise mathematical elucidation of this aggregation step is given in the "Equation of GCN" section.

Activation function and learning weights:

The aggregated features are typically passed through an activation function (e.g., ReLU) to introduce non-linearity. The weight matrix W used in the aggregation step is learned during training. This learning process allows the GCN to adapt to the specific graph and task it is designed for.

Stacking Layers:

GCNs are often used in multiple layers. This allows the network to capture more complex relationships and higher-level features in the graph. The output of one GCN layer becomes the input for the next, and this process is repeated for a predefined number of layers.

Task-Specific Output:

The final output of the GCN can be used for various graph-based tasks, such as node classification, link prediction, or graph classification, depending on the specific application.

Equation of GCN

The Graph Convolutional Network (GCN) is based on a message-passing mechanism that can be described mathematically. For a graph with N nodes, the core equation of a shallow, first-order GCN layer can be expressed as:

\(Z = \sigma\left( A'FW + b \right)\)  (5.1)

Equation 5.1 depicts a GCN layer’s design. The normalized graph adjacency matrix A’ and the node feature matrix F serve as the layer’s inputs; the weight matrix W and the bias vector b are the layer’s trainable parameters, and \(\sigma\) is an element-wise non-linearity such as ReLU.

When applied to the feature matrix, the normalized adjacency matrix effectively smooths a node’s feature vector based on the feature vectors of its close graph neighbors. This matrix captures the graph structure. A’ is normalized so that each neighboring node’s contribution is proportional to the network’s connectivity.

The layer definition is completed by applying an element-wise non-linear function, such as ReLU, to A’FW + b. The downstream node classification task requires deep neural architectures to learn a complicated hierarchy of node attributes, so this layer’s output matrix Z can be routed into another GCN layer or any other neural network layer.
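A minimal NumPy sketch of this layer follows. The symmetric normalization with self-loops is the usual Kipf–Welling choice and is an assumption here; the toy graph and dimensions are invented for the example:

```python
# One GCN layer, Z = ReLU(A'FW + b) (sketch; normalization choice is assumed).
import numpy as np

def gcn_layer(A, F, W, b):
    A_hat = A + np.eye(A.shape[0])            # add self-loops
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt  # A' = D^-1/2 (A + I) D^-1/2
    return np.maximum(0, A_norm @ F @ W + b)  # element-wise ReLU

rng = np.random.default_rng(0)
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])                     # path graph on 3 nodes
F = rng.normal(size=(3, 5))                   # node feature matrix
Z = gcn_layer(A, F, rng.normal(size=(5, 8)), np.zeros(8))
print(Z.shape)  # (3, 8): one 8-dimensional embedding per node
```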

A summary of the graph convolutional network (GCN) is shown in Table 2.

Graph attention network (GAT/GAN)

Graph Attention Network (GAT/GAN) is a new neural network that works with graph-structured data. It uses masked self-attentional layers to address the shortcomings of past methods that depended on graph convolutions or their approximations. By stacking layers, the process makes it possible (implicitly) to assign various nodes in a neighborhood different weights, allowing nodes to focus on the characteristics of their neighborhoods without having to perform an expensive matrix operation (like inversion) or rely on prior knowledge of the graph's structure. GAT concurrently tackles numerous significant limitations of spectral-based graph neural networks, making the model suitable for both inductive and transductive applications.

Working of GAT

The Graph Attention Network (GAT) is a neural network architecture designed for processing and analyzing graph-structured data, as shown in Fig. 20. GATs are a variation of Graph Convolutional Networks (GCNs) that incorporate the concept of attention mechanisms. GAT/GAN works with the steps shown in Fig. 21.

Figure 20: How attention coefficients update

As with other graph-based models, GAT starts with nodes in the graph, each associated with a feature vector. These features can represent various characteristics of the nodes.

Self-Attention Mechanism and Attention Computation:

GAT introduces an attention mechanism similar to what is used in sequence-to-sequence models in natural language processing. The attention mechanism allows each node to focus on different neighbors when aggregating information. It assigns different attention coefficients to the neighboring nodes, making the process more flexible. For each node in the graph, GAT computes attention scores for its neighboring nodes. These attention scores are based on the features of the central node and its neighbors. The attention scores are calculated using a weighted sum of the features of the central node and its neighbors.

The attention scores determine how much each neighbor’s feature contributes to the aggregation for the central node. This weighted aggregation is carried out for all neighboring nodes, resulting in a new feature vector for the central node.

Multiple Attention Heads and Output Combination:

GAT often employs multiple attention heads in parallel. Each attention head computes its attention scores and aggregation results. These multiple attention heads capture different aspects of the relationships in the graph. The outputs from the multiple attention heads are combined, typically by concatenation or averaging, to create a final feature vector for each node.

Learning Weights and Stacking Layers:

Similar to GCNs, GATs learn weight parameters during training. These weights are learned to optimize the attention mechanisms and adapt to the specific graph and task. GATs can be used in multiple layers to capture higher-level features and complex relationships in the graph. The output of one GAT layer becomes the input for the next layer.

The learning weights capture the importance of node relationships and contribute to information aggregation during the neighborhood aggregation process. The learning process in GNNs also relies on backpropagation and optimization algorithms. The stacking of GNN layers enables the model to capture higher-level abstractions and dependencies in the graph. Each layer refines the node representations based on information from the previous layer.

The final output of the GAT can be used for various graph-based tasks, such as node classification, link prediction, or graph classification, depending on the application.

Equation for GAT

GAT’s main distinctive feature is gathering data from the one-hop neighborhood [30]. A graph convolution operation in GCN produces the normalized sum of the node properties of neighbors. Equation 5.2 shows the graph attention network update:

\({h}_{i}^{(l+1)} = \sigma\left( \sum_{j \in N(i)} {c}_{i,j} {W}^{(l)} {h}_{j}^{(l)} \right)\)  (5.2)

where \({h}_{i}^{(l+1)}\) is the current node’s output, \(\sigma\) denotes the non-linear ReLU function, \(j \in N(i)\) ranges over the one-hop neighbors, \({c}_{i,j}\) is the normalized coefficient, \({W}^{(l)}\) is the weight matrix, and \({h}_{j}^{(l)}\) denotes the previous layer’s embedding of node j.

Why is GAT better than GCN?

We learned from the Graph Convolutional Network (GCN) that integrating local graph structure and node-level features results in good node classification performance. The way GCN aggregates messages, on the other hand, is structure-dependent, which may limit its use.

How attention coefficients update: the attention layer has four parts [47]:

A linear transformation: A shared linear transformation is applied to each node:

\({z}_{i}^{(l)} = {W}^{(l)} {h}_{i}^{(l)}\)

where \({h}_{i}^{(l)}\) is the node’s feature vector, \({W}^{(l)}\) is the weight matrix, and \({z}_{i}^{(l)}\) is the transformed output for the node.

Attention coefficients: This step is crucial in the GAT paradigm because every node can now attend to every other node, discarding any structural information. The pair-wise unnormalized attention score between two neighbors is computed by concatenating the z embeddings of the two nodes, taking a dot product with a learnable weight vector \({a}^{(l)}\), and applying a LeakyReLU [1]:

\({e}_{ij}^{(l)} = \mathrm{LeakyReLU}\left( {a}^{(l)\top} \left( {z}_{i}^{(l)} \parallel {z}_{j}^{(l)} \right) \right)\)

where \(\parallel\) stands for concatenation. Contrary to the dot-product attention utilized in the Transformer model, this kind of attention is called additive attention. The nodes are subsequently subjected to self-attention.

Softmax: We utilize the softmax function to normalize the coefficients over all values of j, improving their comparability across nodes:

\({\alpha }_{ij}^{(l)} = \frac{\mathrm{exp}\left({e}_{ij}^{(l)}\right)}{\sum_{k \in N(i)} \mathrm{exp}\left({e}_{ik}^{(l)}\right)}\)

Aggregation: This process is comparable to GCN. The neighborhood embeddings are combined, scaled by the attention scores:

\({h}_{i}^{(l+1)} = \sigma\left( \sum_{j \in N(i)} {\alpha }_{ij}^{(l)} {z}_{j}^{(l)} \right)\)
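A minimal NumPy sketch of a single attention head, following the four steps above, might look like this. Self-loops are included so every node has at least one neighbor to attend to; that, along with the toy graph and dimensions, is an implementation assumption:

```python
# One GAT attention head (sketch; self-loops and sizes are assumptions).
import numpy as np

def gat_head(A, H, W, a, slope=0.2):
    Z = H @ W                                             # 1. z_i = W h_i
    n = A.shape[0]
    scores = np.full((n, n), -np.inf)
    for i in range(n):
        for j in range(n):
            if A[i, j] or i == j:                         # neighbors plus self-loop
                e = a @ np.concatenate([Z[i], Z[j]])      # 2. additive attention...
                scores[i, j] = e if e > 0 else slope * e  # ...with LeakyReLU
    scores -= scores.max(axis=1, keepdims=True)           # numerical stability
    att = np.exp(scores)
    att /= att.sum(axis=1, keepdims=True)                 # 3. softmax over neighbors j
    return np.maximum(0, att @ Z)                         # 4. aggregate, then ReLU

rng = np.random.default_rng(0)
A = np.array([[0, 1], [1, 0]])                            # two connected nodes
H_new = gat_head(A, rng.normal(size=(2, 4)), rng.normal(size=(4, 3)), rng.normal(size=6))
print(H_new.shape)  # (2, 3)
```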

A summary of the graph attention network (GAT) is shown in Table 3.

GraphSAGE

GraphSAGE represents a tangible realization of an inductive learning framework, shown in Fig. 22. It exclusively considers training samples linked to the training set's edges during training. The process consists of two main steps: “sampling” and “aggregation.” Subsequently, the node's representation vector is concatenated with the aggregated neighborhood vector and passed through a fully connected layer with a non-linear activation function. It is important to note that each network layer shares a common aggregator and weight matrix, so the relevant consideration is the number of layers or weight matrices rather than the number of aggregators. Finally, a normalization step is applied to the layer's output.

Two major steps:

Sampling: describes how to sample the neighbors of each node.

Aggregation: refers to obtaining the neighbor nodes' embeddings and determining how to combine these embeddings to update the node's own embedding.

Figure 22: Working of the GraphSAGE method

Working of the graphSAGE model:

First, initialize the feature vectors of all nodes in the input graph.

For each node, get its sampled neighbor nodes.

Use the aggregation function to aggregate the information of the neighbor nodes.

Combine the aggregated information with the node's own embedding and update it through a non-linear transformation.

Types of aggregators

In the GraphSAGE method, four types of aggregators are used:

Simple neighborhood aggregator

Mean aggregator

LSTM aggregator: applies an LSTM to a random permutation of neighbors.

Pooling aggregator: applies a symmetric vector function after transforming the adjacent vectors.

Equation of graphSAGE

The GraphSAGE update at layer k can be written as:

\({h}_{v}^{(k)} = \sigma\left( {W}_{k} \cdot \mathrm{AGG}\left(\left\{ {h}_{u}^{(k-1)} : u \in N(v) \right\}\right) + {B}_{k} {h}_{v}^{(k-1)} \right)\)

where:

\({W}_{k}, {B}_{k}\) are learnable weight matrices;

\({h}_{v}^{(0)} = {x}_{v}\): the initial (layer-0) embeddings are equal to the node features;

\(\mathrm{AGG}\left(\left\{ {h}_{u}^{(k-1)} : u \in N(v) \right\}\right)\) is the generalized aggregation over the neighbors' embeddings;

\({z}_{v} = {h}_{v}^{(K)}\): the embedding after K layers of neighborhood aggregation;

\(\sigma\): a non-linearity (ReLU).
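A minimal NumPy sketch of one GraphSAGE layer with the mean aggregator, matching the equations above, is shown below. The final L2 normalization implements the normalization step mentioned earlier; the toy graph and dimensions are invented for the example:

```python
# One GraphSAGE layer with a mean aggregator (sketch; graph and sizes are toy values).
import numpy as np

def sage_layer(A, H, W_k, B_k):
    deg = A.sum(axis=1, keepdims=True).clip(min=1)
    h_neigh = (A @ H) / deg                       # AGG: mean over N(v)
    out = np.maximum(0, h_neigh @ W_k + H @ B_k)  # sigma(W_k . AGG + B_k . h_v)
    norm = np.linalg.norm(out, axis=1, keepdims=True).clip(min=1e-12)
    return out / norm                             # normalize the layer's output

rng = np.random.default_rng(0)
A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]])
H = rng.normal(size=(3, 4))                       # h^(0) = node features x_v
z = sage_layer(A, H, rng.normal(size=(4, 8)), rng.normal(size=(4, 8)))
print(z.shape)  # (3, 8)
```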

A summary of GraphSAGE is shown in Table 4.

Comparative study of GNN models

Comparison based on practical implementation of GNN models

Table 5 describes the dataset statistics for different datasets used in the literature for graph-type input: CORA, Citeseer, and Pubmed. These statistics provide information about the kind of dataset, the number of nodes and edges, the number of classes, the number of features, and the label rate for each dataset. These details are essential for understanding the characteristics and scale of the datasets used in the context of citation networks. A comparison of the GNN models' equations is shown in Fig. 23.

Figure 23: Equations of GNN models

Table 6 shows the performance of different Graph Neural Network (GNN) models on various datasets, providing accuracy scores for each model and dataset. Additionally, the time taken by some models to compute results is indicated in seconds. This information is crucial for evaluating the performance of these models on specific datasets.

A comparison based on the theoretical concepts of the GNN models is described in Table 7.

Graph neural network applications

Graph construction

Graph Neural Networks (GNNs) have a wide range of applications spanning diverse domains, which encompass modern recommender systems, computer vision, natural language processing, program analysis, software mining, bioinformatics, anomaly detection, and urban intelligence, among others. The fundamental prerequisite for GNN utilization is the transformation or representation of input data into a graph-like structure. In the realm of graph representation learning, GNNs excel in acquiring essential node or graph embeddings that serve as a crucial foundation for subsequent tasks [ 61 ].

The construction of a graph involves a two-fold process:

Graph creation and

Learning about graph representations

Graph Creation: The generation of graphs is essential for depicting the intricate relationships embedded within incoming data. Because input data vary widely in nature, different applications adopt different techniques to create meaningful graphs. This step is indispensable for conveying the structural nuances of the data and ensuring that the nodes and edges carry semantic significance tailored to the specific task at hand.

Learning about graph representations: The subsequent phase utilizes the graph representation built from the input data. For GNN-based learning of graph representations, some studies employ well-established GNN models such as GraphSAGE, GCN, GAT, and GGNN, which offer versatility across application tasks. For specific tasks, however, it may be necessary to customize the GNN architecture to address particular challenges more effectively.

Different applications in which the data are treated as a graph:

Molecular Graphs: Atoms and electrons serve as the basic building blocks of matter and molecules, organized in three-dimensional structures. While all particles interact, we primarily acknowledge a covalent bond between two atoms when they sit at a stable distance from each other. Various atom-to-atom bond configurations exist, including single and double bonds. This three-dimensional arrangement is conveniently and commonly represented as a graph, with atoms as nodes and covalent bonds as edges [ 62 ].

Graphs of social networks: These networks are helpful research tools for identifying trends in the collective behavior of individuals, groups, and organizations. We may create a graph that represents groupings of people by visualizing individuals as nodes and their connections as edges [ 63 ].

Citation networks as graphs: When they publish papers, scientists regularly reference the work of other scientists. Each manuscript can be visualized as a node in a graph of these citation networks, with each directed edge denoting a citation from one publication to another. Additionally, we can include details about each document in each node, such as an abstract's word embedding [ 64 ].

Within computer vision: We may want to tag certain objects in visual scenes; we can then construct graphs by treating these objects as nodes and their relationships as edges.
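As a concrete illustration of the list above, the following toy sketch (all paper names, edges, and numbers are invented for this example) encodes a tiny citation network as a directed edge list with per-node feature vectors, which is the form most GNN frameworks expect as input:

import numpy as np

# Hypothetical papers (nodes) and citations (directed edges: citing -> cited)
papers = ["paper_A", "paper_B", "paper_C", "paper_D"]
edges = [(0, 1), (0, 2), (1, 2), (3, 0)]   # e.g., paper_A cites paper_B

# Node features, e.g., a word embedding of each paper's abstract
# (random placeholders here).
rng = np.random.default_rng(0)
features = rng.normal(size=(len(papers), 16))

# Edge index in the 2 x num_edges layout used by many GNN frameworks
edge_index = np.array(edges).T
print(edge_index.shape)   # (2, 4)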

GNNs are used to model data as graphs, allowing for the capture of complex relationships and dependencies that traditional machine learning models may struggle to represent. This makes GNNs a valuable tool for tasks where data has an inherent graph structure or where modeling relationships is crucial for accurate predictions and analysis.

Graph neural network (GNN) applications in different fields

NLP (natural language processing)

Document Classification: GNNs can be used to model the relationships between words or sentences in documents, allowing for improved document classification and information retrieval.

Text Generation: GNNs can assist in generating coherent and contextually relevant text by capturing dependencies between words or phrases.

Question Answering: GNNs can help in question-answering tasks by representing the relationships between question words and candidate answers within a knowledge graph.

Sentiment Analysis: GNNs can capture contextual information and sentiment dependencies in text, improving sentiment analysis tasks.

Computer vision

Image Segmentation: GNNs can be employed for pixel-level image segmentation tasks by modeling relationships between adjacent pixels as a graph.

Object Detection: GNNs can assist in object detection by capturing contextual information and relationships between objects in images.

Scene Understanding: GNNs are used for understanding complex scenes and modeling spatial relationships between objects in an image.

Bioinformatics

Protein-Protein Interaction Prediction: GNNs can be applied to predict interactions between proteins in biological networks, aiding in drug discovery and understanding disease mechanisms.

Genomic Sequence Analysis: GNNs can model relationships between genes or genetic sequences, helping in gene expression prediction and sequence classification tasks.

Drug Discovery: GNNs can be used for drug-target interaction prediction and molecular property prediction, which is vital in pharmaceutical research.

Table 8 offers a concise overview of various research papers that utilize Graph Neural Networks (GNNs) in diverse domains, showcasing the applications and contributions of GNNs in each study.

Table 9 highlights various applications of GNNs in Natural Language Processing, Computer Vision, and Bioinformatics domains, showcasing how GNN models are adapted and used for specific tasks within each field.

Future directions of graph neural networks

This survey's main emphasis has been the contribution of the existing literature to GNN principles, models, datasets, and applications. In this section, several potential future research directions are suggested. Significant challenges have been noted, including imbalanced datasets, the limited accuracy of existing methods, and text classification. We have also examined remedies to these problems and suggest advanced directions involving domain adaptation, data augmentation, and improved classification. Table 10 displays these future directions.

Imbalanced Datasets—Limited labeled data, domain-dependent data, and imbalanced data are currently issues with available datasets. Transfer learning and domain adaptation are solutions to these issues.

Accuracy of Existing Systems/Models—Deep learning models such as GCN, GAT, and GraphSAGE can be utilized to increase the efficiency and precision of current systems. Additionally, training models on sizable, domain-specific datasets can enhance performance.

Enhancing Text Classification: Text classification poses another significant challenge, which is effectively addressed by leveraging advanced deep learning methodologies like graph neural networks, contributing to the improvement of text classification accuracy and performance.

Table 10 describes the research gaps and future directions presented in the literature above. These research gaps and future directions highlight the challenges and proposed solutions in the field of text classification and structural analysis.

Table 11 provides an overview of different research papers, their publication years, the applications they address, the graph structures they use, the graph types, the graph tasks, and the specific Graph Neural Network (GNN) models utilized in each study.

Conclusions

Graph Neural Networks (GNNs) have witnessed rapid advancements in addressing the unique challenges presented by data structured as graphs, a domain where conventional deep learning techniques, originally designed for images and text, often struggle to provide meaningful insights. GNNs offer a powerful and intuitive approach that finds broad utility in applications relying on graph structures. This comprehensive survey on GNNs offers an in-depth analysis covering critical aspects such as GNN fundamentals, the interplay with convolutional neural networks, GNN message-passing mechanisms, diverse GNN models, practical use cases, and a forward-looking perspective. Our central focus is on elucidating the foundational characteristics of GNNs, a field teeming with contemporary applications that continually enhance our comprehension and utilization of this technology.

The continuous evolution of GNN-based research has underscored the growing need to address issues related to graph analysis, which we aptly refer to as the frontiers of GNNs. In our exploration, we delve into several crucial recent research domains within the realm of GNNs, encompassing areas like link prediction, graph generation, and graph categorization, among others.

Availability of data and materials

Not applicable.

Abbreviations

GNN: Graph Neural Network

GCN: Graph Convolution Network

GAT: Graph Attention Networks

NLP: Natural Language Processing

CNN: Convolution Neural Networks

RNN: Recurrent Neural Networks

ML: Machine Learning

DL: Deep Learning

KG: Knowledge Graph

Pucci A, Gori M, Hagenbuchner M, Scarselli F, Tsoi AC. Investigation into the application of graph neural networks to large-scale recommender systems, infona.pl, no. 32, no 4, pp. 17–26, 2006.

Mahmud FB, Rayhan MM, Shuvo MH, Sadia I, Morol MK. A comparative analysis of Graph Neural Networks and commonly used machine learning algorithms on fake news detection, Proc. - 2022 7th Int. Conf. Data Sci. Mach. Learn. Appl. CDMA 2022, pp. 97–102, 2022.

Cui L, Seo H, Tabar M, Ma F, Wang S, Lee D, Deterrent: Knowledge Guided Graph Attention Network for Detecting Healthcare Misinformation, Proc. ACM SIGKDD Int. Conf. Knowl. Discov. Data Min., pp. 492–502, 2020.

Gori M, Monfardini G, Scarselli F. A new model for learning in graph domains, Proc. Int. Jt. Conf. Neural Networks, vol. 2, no. January 2005, pp. 729–734, 2005, https://doi.org/10.1109/IJCNN.2005.1555942 .

Scarselli F, Yong SL, Gori M, Hagenbuchner M, Tsoi AC, Maggini M. Graph neural networks for ranking web pages, Proc.—2005 IEEE/WIC/ACM Int. Web Intell. WI 2005, vol. 2005, no. January, pp. 666–672, 2005, doi: https://doi.org/10.1109/WI.2005.67 .

Gandhi S, Iyer AP. P3: Distributed deep graph learning at scale, Proc. 15th USENIX Symp. Oper. Syst. Des. Implementation, OSDI 2021, pp. 551–568, 2021.

Li C, Guo J, Zhang H. Pruning neighborhood graph for geodesic distance based semi-supervised classification, in 2007 International Conference on Computational Intelligence and Security (CIS 2007), 2007, pp. 428–432.

Zhang Z, Cui P, Pei J, Wang X, Zhu W, Eigen-gnn: A graph structure preserving plug-in for gnns, IEEE Trans. Knowl. Data Eng., 2021.

Nandedkar AV, Biswas PK. A granular reflex fuzzy min–max neural network for classification. IEEE Trans Neural Netw. 2009;20(7):1117–34.


Chaturvedi DK, Premdayal SA, Chandiok A. Short-term load forecasting using soft computing techniques. Int’l J Commun Netw Syst Sci. 2010;3(03):273.


Hashem T, Kulik L, Zhang R. Privacy preserving group nearest neighbor queries, in Proceedings of the 13th International Conference on Extending Database Technology, 2010, pp. 489–500.

Sun Z et al. Knowledge graph alignment network with gated multi-hop neighborhood aggregation, in Proceedings of the AAAI Conference on Artificial Intelligence, 2020, vol. 34, no. 01, pp. 222–229.

Zhang M, Chen Y. Link prediction based on graph neural networks. Adv Neural Inf Process Syst. 31, 2018.

Stanimirović PS, Katsikis VN, Li S. Hybrid GNN-ZNN models for solving linear matrix equations. Neurocomputing. 2018;316:124–34.

Stanimirović PS, Petković MD. Gradient neural dynamics for solving matrix equations and their applications. Neurocomputing. 2018;306:200–12.

Zhang C, Song D, Huang C, Swami A, Chawla NV. Heterogeneous graph neural network, in Proceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, 2019, pp. 793–803.

Fan W et al. Graph neural networks for social recommendation," in The world wide web conference, 2019, pp. 417–426.

Gui T et al. A lexicon-based graph neural network for Chinese NER," in Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 2019, pp. 1040–1050.

Qasim SR, Mahmood H, Shafait F. Rethinking table recognition using graph neural networks, in 2019 International Conference on Document Analysis and Recognition (ICDAR), 2019, pp. 142–147

You J, Ying R, Leskovec J. Position-aware graph neural networks, in International conference on machine learning, 2019, pp. 7134–7143.

Cao D, et al. Spectral temporal graph neural network for multivariate time-series forecasting. Adv Neural Inf Process Syst. 2020;33:17766–78.

Xhonneux LP, Qu M, Tang J. Continuous graph neural networks. In International Conference on Machine Learning, 2020, pp. 10432–10441.

Zhou K, Huang X, Li Y, Zha D, Chen R, Hu X. Towards deeper graph neural networks with differentiable group normalization. Adv Neural Inf Process Syst. 2020;33:4917–28.

Gu F, Chang H, Zhu W, Sojoudi S, El Ghaoui L. Implicit graph neural networks. Adv Neural Inf Process Syst. 2020;33:11984–95.

Liu Y, Guan R, Giunchiglia F, Liang Y, Feng X. Deep attention diffusion graph neural networks for text classification. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 2021, pp. 8142–8152.

Gasteiger J, Becker F, Günnemann S. Gemnet: universal directional graph neural networks for molecules. Adv Neural Inf Process Syst. 2021;34:6790–802.

Yao D et al. Deep hybrid: multi-graph neural network collaboration for hyperspectral image classification. Def. Technol. 2022.

Li Y, et al. Research on multi-port ship traffic prediction method based on spatiotemporal graph neural networks. J Mar Sci Eng. 2023;11(7):1379.

Djenouri Y, Belhadi A, Srivastava G, Lin JC-W. Hybrid graph convolution neural network and branch-and-bound optimization for traffic flow forecasting. Futur Gener Comput Syst. 2023;139:100–8.

Zhou J, et al. Graph neural networks: a review of methods and applications. AI Open. 2020;1(January):57–81. https://doi.org/10.1016/j.aiopen.2021.01.001 .

Rong Y, Huang W, Xu T, Huang J. Dropedge: Towards deep graph convolutional networks on node classification. arXiv preprint arXiv:1907.10903. 2019.

Abu-Salih B, Al-Qurishi M, Alweshah M, Al-Smadi M, Alfayez R, Saadeh H. Healthcare knowledge graph construction: a systematic review of the state-of-the-art, open issues, and opportunities. J Big Data. 2023;10(1):81.

Kipf TN, Welling M. Semi-supervised classification with graph convolutional networks. arXiv Prepr. arXiv1609.02907, 2016.

Berg RV, Kipf TN, Welling M. Graph Convolutional Matrix Completion. 2017, http://arxiv.org/abs/1706.02263

Monti F, Boscaini D, Masci J, Rodola E, Svoboda J, Bronstein MM. Geometric deep learning on graphs and manifolds using mixture model cnns. InProceedings of the IEEE conference on computer vision and pattern recognition 2017 (pp. 5115-5124).

Cui Z, Henrickson K, Ke R, Wang Y. Traffic graph convolutional recurrent neural network: a deep learning framework for network-scale traffic learning and forecasting. IEEE Trans Intell Transp Syst. 2020;21(11):4883–94. https://doi.org/10.1109/TITS.2019.2950416 .

Yang J, Lu J, Lee S, Batra D, Parikh D. Graph r-cnn for scene graph generation. InProceedings of the European conference on computer vision (ECCV) 2018 (pp. 670-685). https://doi.org/10.1007/978-3-030-01246-5_41 .

Teney D, Liu L, van Den Hengel A. Graph-structured representations for visual question answering. InProceedings of the IEEE conference on computer vision and pattern recognition 2017 (pp. 1-9). https://doi.org/10.1109/CVPR.2017.344 .

Yao L, Mao C, Luo Y. Graph convolutional networks for text classification. Proc AAAI Conf Artif Intell. 2019;33(01):7370–7.

De Cao N, Aziz W, Titov I. Question answering by reasoning across documents with graph convolutional networks. arXiv Prepr. arXiv1808.09920, 2018.

Gao H, Wang Z, Ji S. Large-scale learnable graph convolutional networks. in Proceedings of the 24th ACM SIGKDD international conference on knowledge discovery & data mining, 2018, pp. 1416–1424.

Hu F, Zhu Y, Wu S, Wang L, Tan T. Hierarchical graph convolutional networks for semi-supervised node classification. arXiv Prepr. arXiv1902.06667, 2019.

Lange O, Perez L. Traffic prediction with advanced graph neural networks. DeepMind Research Blog Post, https://deepmind.google/discover/blog/traffic-prediction-with-advanced-graph-neural-networks/ . 2020.

Duan C, Hu B, Liu W, Song J. Motion capture for sporting events based on graph convolutional neural networks and single target pose estimation algorithms. Appl Sci. 2023;13(13):7611.

Balcıoğlu YS, Sezen B, Çerasi CC, Huang SH. machine design automation model for metal production defect recognition with deep graph convolutional neural network. Electronics. 2023;12(4):825.

Baghbani A, Bouguila N, Patterson Z. Short-term passenger flow prediction using a bus network graph convolutional long short-term memory neural network model. Transp Res Rec. 2023;2677(2):1331–40.

Velickovic P, Cucurull G, Casanova A, Romero A, Lio P, Bengio Y. Graph attention networks. Stat. 2017;1050(20):10–48550.

Hamilton W, Ying Z, Leskovec J. Inductive representation learning on large graphs. Advances in neural information processing systems. 2017; 30.

Ye Y, Ji S. Sparse graph attention networks. IEEE Trans Knowl Data Eng. 2021;35(1):905–16.


Chen Z et al. Graph neural network-based fault diagnosis: a review. arXiv Prepr. arXiv2111.08185, 2021.

Brody S, Alon U, Yahav E. How attentive are graph attention networks? arXiv Prepr. arXiv2105.14491, 2021.

Huang J, Shen H, Hou L, Cheng X. Signed graph attention networks," in International Conference on Artificial Neural Networks. 2019, pp. 566–577.

Seraj E, Wang Z, Paleja R, Sklar M, Patel A, Gombolay M. Heterogeneous graph attention networks for learning diverse communication. arXiv preprint arXiv: 2108.09568. 2021.

Zhang Y, Wang X, Shi C, Jiang X, Ye Y. Hyperbolic graph attention network. IEEE Transactions on Big Data. 2021;8(6):1690–701.

Yang X, Ma H, Wang M. Research on rumor detection based on a graph attention network with temporal features. Int J Data Warehous Min. 2023;19(2):1–17.

Lan W, et al. KGANCDA: predicting circRNA-disease associations based on knowledge graph attention network. Brief Bioinform. 2022;23(1):bbab494.

Xiao L, Wu X, Wang G, 2019, December. Social network analysis based on graph SAGE. In 2019 12th international symposium on computational intelligence and design (ISCID) (Vol. 2, pp. 196–199). IEEE.

Chang L, Branco P. Graph-based solutions with residuals for intrusion detection: The modified e-graphsage and e-resgat algorithms. arXiv preprint arXiv:2111.13597. 2021.

Oh J, Cho K, Bruna J. Advancing graphsage with a data-driven node sampling. arXiv preprint arXiv:1904.12935. 2019.

Kapoor M, Patra S, Subudhi BN, Jakhetiya V, Bansal A. Underwater Moving Object Detection Using an End-to-End Encoder-Decoder Architecture and GraphSage With Aggregator and Refactoring. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 5635-5644).

Bhatti UA, Tang H, Wu G, Marjan S, Hussain A. Deep learning with graph convolutional networks: An overview and latest applications in computational intelligence. Int J Intell Syst. 2023;2023:1–28.

David L, Thakkar A, Mercado R, Engkvist O. Molecular representations in AI-driven drug discovery: a review and practical guide. J Cheminform. 2020;12(1):1–22.

Davies A, Ajmeri N. Realistic Synthetic Social Networks with Graph Neural Networks. arXiv preprint arXiv:2212.07843. 2022; 15.

Frank MR, Wang D, Cebrian M, Rahwan I. The evolution of citation graphs in artificial intelligence research. Nat Mach Intell. 2019;1(2):79–85.

Gao C, Wang X, He X, Li Y. Graph neural networks for recommender system. InProceedings of the Fifteenth ACM International Conference on Web Search and Data Mining 2022 (pp. 1623-1625).

Wu S, Sun F, Zhang W, Xie X, Cui B. Graph neural networks in recommender systems: a survey. ACM Comput Surv. 2022;55(5):1–37.

Wu L, Chen Y, Shen K, Guo X, Gao H, Li S, Pei J, Long B. Graph neural networks for natural language processing: a survey. Found Trends Mach Learn. 2023;16(2):119–328.

Wu L, Chen Y, Ji H, Liu B. Deep learning on graphs for natural language processing. InProceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval 2021 (pp. 2651-2653).

Liu X, Su Y, Xu B. The application of graph neural network in natural language processing and computer vision. In2021 3rd International Conference on Machine Learning, Big Data and Business Intelligence (MLBDBI) 2021 (pp. 708-714).

Harmon SHE, Faour DE, MacDonald NE. Mandatory immunization and vaccine injury support programs: a survey of 28 GNN countries. Vaccine. 2021;39(49):7153–7.

Yan W, Zhang Z, Zhang Q, Zhang G, Hua Q, Li Q. Deep data analysis-based agricultural products management for smart public healthcare. Front Public Health. 2022;10:847252.

Hamaguchi T, Oiwa H, Shimbo M, Matsumoto Y. Knowledge transfer for out-of-knowledge-base entities: a graph neural network approach. arXiv preprint arXiv:1706.05674. 2017.

Dai D, Zheng H, Luo F, Yang P, Chang B, Sui Z. Inductively representing out-of-knowledge-graph entities by optimal estimation under translational assumptions. arXiv preprint arXiv:2009.12765.

Pradhyumna P, Shreya GP. Graph neural network (GNN) in image and video understanding using deep learning for computer vision applications. In2021 Second International Conference on Electronics and Sustainable Communication Systems (ICESC) 2021 (pp. 1183-1189).

Shi W, Rajkumar R. Point-gnn: Graph neural network for 3d object detection in a point cloud. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition 2020 (pp. 1711-1719).

Wu Y, Dai HN, Tang H. Graph neural networks for anomaly detection in industrial internet of things. IEEE Int Things J. 2021;9(12):9214–31.

Pitsik EN, et al. The topology of fMRI-based networks defines the performance of a graph neural network for the classification of patients with major depressive disorder. Chaos Solitons Fractals. 2023;167: 113041.

Liao W, Zeng B, Liu J, Wei P, Cheng X, Zhang W. Multi-level graph neural network for text sentiment analysis. Comput Electr Eng. 2021;92: 107096.

Kumar VS, Alemran A, Karras DA, Gupta SK, Dixit CK, Haralayya B. Natural Language Processing using Graph Neural Network for Text Classification. In2022 International Conference on Knowledge Engineering and Communication Systems (ICKES) 2022; (pp. 1-5).

Dara S, Srinivasulu CH, Babu CM, Ravuri A, Paruchuri T, Kilak AS, Vidyarthi A. Context-Aware auto-encoded graph neural model for dynamic question generation using NLP. ACM transactions on asian and low-resource language information processing. 2023.

Wu L, Cui P, Pei J, Zhao L, Guo X. Graph neural networks: foundation, frontiers and applications. InProceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining 2022; (pp. 4840-4841).

Scarselli F, Gori M, Tsoi AC, Hagenbuchner M, Monfardini G. The graph neural network model. IEEE Trans neural networks. 2008;20(1):61–80.

Cao P, Zhu Z, Wang Z, Zhu Y, Niu Q. Applications of graph convolutional networks in computer vision. Neural Comput Appl. 2022;34(16):13387–405.

You R, Yao S, Mamitsuka H, Zhu S. DeepGraphGO: graph neural network for large-scale, multispecies protein function prediction. Bioinformatics. 2021;37(Supplement_1):i262-71.

Long Y, et al. Pre-training graph neural networks for link prediction in biomedical networks. Bioinformatics. 2022;38(8):2254–62.

Wu Y, Gao M, Zeng M, Zhang J, Li M. BridgeDPI: a novel graph neural network for predicting drug–protein interactions. Bioinformatics. 2022;38(9):2571–8.

Kang C, Zhang H, Liu Z, Huang S, Yin Y. LR-GNN: a graph neural network based on link representation for predicting molecular associations. Briefings Bioinf. 2022;23(1):bbab513.

Wei X, Huang H, Ma L, Yang Z, Xu L. Recurrent Graph Neural Networks for Text Classification. in 2020 IEEE 11th International Conference on Software Engineering and Service Science (ICSESS), 2020, pp. 91–97.

Schlichtkrull MS, De Cao N, Titov I. Interpreting graph neural networks for nlp with differentiable edge masking. arXiv Prepr. arXiv2010.00577, 2020.

Tu M, Huang J, He X, Zhou B. Graph sequential network for reasoning over sequences. arXiv Prepr. arXiv2004.02001, 2020.


Acknowledgements

I am grateful to all of those with whom I have had the pleasure to work during this research work. Each member has provided me extensive personal and professional guidance and taught me a great deal about scientific research and life in general.

This work was supported by the Research Support Fund (RSF) of Symbiosis International (Deemed University), Pune, India.

Author information

Authors and Affiliations

Symbiosis Institute of Technology Pune Campus, Symbiosis International (Deemed University) (SIU), Lavale, Pune, 412115, India

Bharti Khemani

Symbiosis Centre for Applied Artificial Intelligence (SCAAI), Symbiosis Institute of Technology Pune Campus, Symbiosis International (Deemed University) (SIU), Lavale, Pune, 412115, India

Shruti Patil & Ketan Kotecha

IEEE, Department of Computer Science and Engineering, Institute of Technology, Nirma University, Ahmedabad, India

Sudeep Tanwar


Contributions

Conceptualization, BK and SP; methodology, BK and SP; software, BK; validation, BK, SP, KK; formal analysis, BK; investigation, BK; resources, BK; data curation, BK and SP; writing—original draft preparation, BK; writing—review and editing, SP, KK, and ST; visualization, BK; supervision, SP; project administration, SP, ST; funding acquisition, KK.

Corresponding author

Correspondence to Shruti Patil .

Ethics declarations

Ethics approval and consent to participate

Not applicable

Consent for publication

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

See Tables 12 and 13.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/ .


About this article

Cite this article

Khemani, B., Patil, S., Kotecha, K. et al. A review of graph neural networks: concepts, architectures, techniques, challenges, datasets, applications, and future directions. J Big Data 11 , 18 (2024). https://doi.org/10.1186/s40537-023-00876-4


Received : 28 June 2023

Accepted : 27 December 2023

Published : 16 January 2024

DOI : https://doi.org/10.1186/s40537-023-00876-4


Keywords: Graph Neural Network (GNN), Graph Convolution Network (GCN), Graph Attention Networks (GAT), Message Passing Mechanism, Natural Language Processing (NLP)


Frontiers in Neurorobotics | Research Topics

Neural Network Models in Autonomous Robotics


About this Research Topic

The integration of neural network models in autonomous robotics represents a monumental leap in artificial intelligence and robotics. These models, mirroring the human brain's complexity and efficiency, have catalyzed innovations in machine learning, fostering more adaptive, intelligent, and efficient robotic ...

Keywords : Neural Network Models, Autonomous Robotics, Energy Efficiency, Multi-modal Sensory Data, Human-Robot Collaboration


MIT News | Massachusetts Institute of Technology

These neural networks know what they’re doing


Neural networks can learn to solve all sorts of problems, from identifying cats in photographs to steering a self-driving car. But whether these powerful, pattern-recognizing algorithms actually understand the tasks they are performing remains an open question.

For example, a neural network tasked with keeping a self-driving car in its lane might learn to do so by watching the bushes at the side of the road, rather than learning to detect the lanes and focus on the road’s horizon.

Researchers at MIT have now shown that a certain type of neural network is able to learn the true cause-and-effect structure of the navigation task it is being trained to perform. Because these networks can understand the task directly from visual data, they should be more effective than other neural networks when navigating in a complex environment, like a location with dense trees or rapidly changing weather conditions.

In the future, this work could improve the reliability and trustworthiness of machine learning agents that are performing high-stakes tasks, like driving an autonomous vehicle on a busy highway.

“Because these machine-learning systems are able to perform reasoning in a causal way, we can know and point out how they function and make decisions. This is essential for safety-critical applications,” says co-lead author Ramin Hasani, a postdoc in the Computer Science and Artificial Intelligence Laboratory (CSAIL).

Co-authors include electrical engineering and computer science graduate student and co-lead author Charles Vorbach; CSAIL PhD student Alexander Amini; Institute of Science and Technology Austria graduate student Mathias Lechner; and senior author Daniela Rus, the Andrew and Erna Viterbi Professor of Electrical Engineering and Computer Science and director of CSAIL. The research will be presented at the 2021 Conference on Neural Information Processing Systems (NeurIPS) in December.

An attention-grabbing result

Neural networks are a method of machine learning in which a computer learns to complete a task through trial and error by analyzing many training examples. “Liquid” neural networks change their underlying equations so as to continuously adapt to new inputs.

The new research draws on previous work in which Hasani and others showed how a brain-inspired type of deep learning system called a Neural Circuit Policy (NCP), built by liquid neural network cells, is able to autonomously control a self-driving vehicle, with a network of only 19 control neurons.  

The researchers observed that the NCPs performing a lane-keeping task kept their attention on the road’s horizon and borders when making a driving decision, the same way a human would (or should) while driving a car. Other neural networks they studied didn’t always focus on the road.

“That was a cool observation, but we didn’t quantify it. So, we wanted to find the mathematical principles of why and how these networks are able to capture the true causation of the data,” he says.

They found that, when an NCP is being trained to complete a task, the network learns to interact with the environment and account for interventions. In essence, the network recognizes if its output is being changed by a certain intervention, and then relates the cause and effect together.  

During training, the network is run forward to generate an output, and then backward to correct for errors. The researchers observed that NCPs relate cause-and-effect during forward-mode and backward-mode, which enables the network to place very focused attention on the true causal structure of a task.

Hasani and his colleagues didn’t need to impose any additional constraints on the system or perform any special set up for the NCP to learn this causality.

“Causality is especially important to characterize for safety-critical applications such as flight,” says Rus. “Our work demonstrates the causality properties of Neural Circuit Policies for decision-making in flight, including flying in environments with dense obstacles such as forests and flying in formation.”

Weathering environmental changes

They tested NCPs through a series of simulations in which autonomous drones performed navigation tasks. Each drone used inputs from a single camera to navigate.

The drones were tasked with traveling to a target object, chasing a moving target, or following a series of markers in varied environments, including a redwood forest and a neighborhood. They also traveled under different weather conditions, like clear skies, heavy rain, and fog.

The researchers found that the NCPs performed as well as the other networks on simpler tasks in good weather, but outperformed them all on the more challenging tasks, such as chasing a moving object through a rainstorm.

“We observed that NCPs are the only network that pays attention to the object of interest in different environments while completing the navigation task, wherever you test it, and in different lighting or environmental conditions. This is the only system that can do this causally and actually learn the behavior we intend the system to learn,” he says.

Their results show that the use of NCPs could also enable autonomous drones to navigate successfully in environments with changing conditions, like a sunny landscape that suddenly becomes foggy.

“Once the system learns what it is actually supposed to do, it can perform well in novel scenarios and environmental conditions it has never experienced. This is a big challenge of current machine learning systems that are not causal. We believe these results are very exciting, as they show how causality can emerge from the choice of a neural network,” he says.

In the future, the researchers want to explore the use of NCPs to build larger systems. Putting thousands or millions of networks together could enable them to tackle even more complicated tasks.

This research was supported by the United States Air Force Research Laboratory, the United States Air Force Artificial Intelligence Accelerator, and the Boeing Company.

Correspondence | Published: 21 October 2015

Neural networks in the future of neuroscience research

Mikail Rubinov

Nature Reviews Neuroscience volume 16, page 767 (2015)


Subjects: Network models

Neural networks are increasingly seen to supersede neurons as fundamental units of complex brain function. In his Timeline article (From the neuron doctrine to neural networks. Nat. Rev. Neurosci. 16 , 487–497 (2015)) 1 , Yuste provides a timely overview of this process, but does not clearly differentiate between biological neural network models (broadly and imprecisely defined as empirically valid models of (embodied) neuronal or brain systems, which enable the emergence of complex brain function through distributed computation) and artificial neural network models (a relatively well-defined class of networks originally designed to model complex brain function 2 but now mainly viewed as a class of biologically inspired data-analysis algorithms useful in diverse scientific fields 3 ).

A distinction between biological and artificial neural network models is important as the neuroscience network paradigm is mainly driven by the aim of uncovering biologically valid mechanisms of neural computation. Artificial neural networks were initially proposed as candidate models for such computation but, despite being enthusiastically researched at the end of the twentieth century, they have largely not bridged the gap between elegant theory and neuroscientific observation 4 , 5 . In this context, Yuste's emphasis on some classic artificial neural network models does not seem to be supported by the evidence of, or the promise for, the problem-solving capacity of these models in neuroscience 6 .

What could be an alternative promising approach to biologically valid neural network modelling? At present we can only speculate, but the ongoing development of high-resolution high-throughput brain imaging technologies — including those being developed as part of the BRAIN Initiative 7 — and the consequent availability of increasingly large structural 8 and functional 9 imaging data sets, make it appealing to initially search for patterns in such data in less theory-bound and more data-driven ways 10 , 11 , and to subsequently construct theories a priori constrained on these discovered patterns 12 . A famous example of this approach in biology is the formulation of the theory of evolution by natural selection; this theory arose from an initial aim to catalogue all living biological organisms on earth, and from a subsequent careful analysis of the obtained diverse biological data 13 . Interestingly, artificial neural networks may yet prove to be important in this quest but in the role of powerful tools for analysing complex imaging data sets 14 , rather than as a theoretical foundation for how the brain computes.

Yuste, R. From the neuron doctrine to neural networks. Nat. Rev. Neurosci. 16 , 487–497 (2015).


Rumelhart, D. E., McClelland, J. L. & The PDP Research Group. Parallel Distributed Processing: Explorations in the Microstructure of Cognition (MIT Press, 1986).


LeCun, Y., Bengio, Y. & Hinton, G. Deep learning. Nature 521 , 436–444 (2015).

Marcus, G. in The Future of the Brain: Essays by the World's Leading Neuroscientists (eds Marcus, G. & Freeman, J.) 205–215 (Princeton Univ. Press, 2014).


Zador, A. in The Future of the Brain: Essays by the World's Leading Neuroscientists (eds Marcus, G. & Freeman, J.) 40–49 (Princeton Univ. Press, 2014).

Laudan, L. Progress and Its Problems: Towards a Theory of Scientific Growth (University of California Press, 1978).

Alivisatos, A. P. et al. Nanotools for neuroscience and brain activity mapping. ACS Nano 7 , 1850–1866 (2013).

Oh, S. W. et al. A mesoscale connectome of the mouse brain. Nature 508 , 207–214 (2014).

Ahrens, M. B. et al. Brain-wide neuronal dynamics during motor adaptation in zebrafish. Nature 485 , 471–477 (2012).

Sporns, O. Discovering the Human Connectome (MIT Press, 2012).

Vogelstein, J. T. et al. Discovery of brainwide neural-behavioral maps via multiscale unsupervised structure learning. Science 344 , 386–392 (2014).

Sejnowski, T. J., Churchland, P. S. & Movshon, J. A. Putting big data to good use in neuroscience. Nat. Neurosci. 17 , 1440–1441 (2014).

Kell, D. B. & Oliver, S. G. Here is the evidence, now what is the hypothesis? The complementary roles of inductive and hypothesis-driven science in the post-genomic era. BioEssays 26 , 99–105 (2004).


Helmstaedter, M. et al. Connectomic reconstruction of the inner plexiform layer in the mouse retina. Nature 500 , 168–174 (2013).


Acknowledgements

The author thanks C. Chang for helpful comments. The author has received funding from the NARSAD Young Investigator Award, the Isaac Newton Trust Research Grant and the Parke Davis Exchange Fellowship.

Author information

Authors and Affiliations

Mikail Rubinov is at the Department of Psychiatry and Churchill College, University of Cambridge, Cambridge CB3 0DS, UK; and the Janelia Research Campus, Howard Hughes Medical Institute, Ashburn, Virginia 20147, USA.


Corresponding author

Correspondence to Mikail Rubinov .

Ethics declarations

Competing interests

The author declares no competing financial interests.


About this article

Cite this article

Rubinov, M. Neural networks in the future of neuroscience research. Nat Rev Neurosci 16 , 767 (2015). https://doi.org/10.1038/nrn4042


Published : 21 October 2015

Issue Date : December 2015

DOI : https://doi.org/10.1038/nrn4042


This article is cited by

Application of MLP-ANN models for estimating the higher heating value of bamboo biomass

  • Satyajit Pattanayak
  • Chanchal Loha
  • Lalsangzela Sailo

Biomass Conversion and Biorefinery (2021)

A design and development of support system for prediction of various renal syndromes using artificial neural networks

  • Gollapalli Sumana
  • K. Kalaiselvi
  • M. Kezia Joseph

International Journal of System Assurance Engineering and Management (2021)

On testing neural network models

  • Rafael Yuste

Nature Reviews Neuroscience (2015)



Recurrent Neural Networks: A Comprehensive Review of Architectures, Variants, and Applications


1. Introduction
2. Related Works
3. Fundamentals of RNNs
  3.1. Basic Architecture and Working Principle of Standard RNNs
  3.2. Activation Functions
  3.3. The Vanishing and Exploding Gradient Problems
  3.4. Bidirectional RNNs
  3.5. Deep RNNs
4. Advanced Variants of RNNs
  4.1. Long Short-Term Memory Networks
  4.2. Bidirectional LSTM; Stacked LSTM
  4.3. Gated Recurrent Units; Comparison with LSTM
  4.4. Other Notable Variants
    4.4.1. Peephole LSTM
    4.4.2. Echo State Networks

  • Deep Echo-State Networks: Recent research has extended the ESN architecture to deeper variants, known as deep echo-state networks (DeepESNs). In DeepESNs, multiple reservoir layers are stacked, allowing the network to capture hierarchical temporal features across different timescales [ 87 ]. Each layer in a DeepESN processes the output of the previous layer's reservoir, enabling the model to learn more abstract and complex representations of the input data. The state update for a DeepESN can be generalized as \( h_t^l = \tanh(W_{in}^l h_t^{l-1} + W_{res}^l h_{t-1}^l) \) (31), where \( l \) denotes the layer number, \( h_t^l \) is the hidden state at layer \( l \), \( W_{in}^l \) is the input weight matrix for layer \( l \), \( W_{res}^l \) is the recurrent (reservoir) weight matrix for layer \( l \), and \( h_t^{l-1} \) is the hidden state from the previous layer; a minimal one-step sketch of this update appears after this list. DeepESNs have demonstrated improved performance in tasks requiring the modeling of complex temporal patterns, such as speech recognition and financial time series forecasting [ 88 ].
  • Ensemble Deep ESNs: In ensemble deep ESNs, multiple DeepESNs are trained independently, and their outputs are combined to form the final prediction [ 89 ]. This ensemble approach leverages the diversity of the reservoirs and the deep architecture to improve robustness and accuracy, particularly in time series forecasting applications. For instance, Gao et al. [ 90 ] demonstrated the effectiveness of Deep ESN ensembles in predicting significant wave heights, where the ensemble approach helped mitigate the impact of reservoir initialization variability and improved the model’s generalization ability.
  • Input Processing with Signal Decomposition: Another critical aspect of effectively utilizing RNNs and ESNs is the preprocessing of input signals. Given the complex and often noisy nature of real-world time series data, signal decomposition techniques such as the empirical wavelet transform (EWT) have been employed to enhance the input to ESNs [ 91 ]. The EWT decomposes the input signal into different frequency components, allowing the ESN to process each component separately and improve the model’s ability to capture underlying patterns. The combination of the EWT with ESNs has shown promising results in various applications, including time series forecasting, where it helps reduce noise and enhance the predictive performance of the model.
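For intuition, here is a minimal one-step DeepESN state update in NumPy, following Eq. (31) directly. This is a sketch under assumed layer sizes and weight scaling; in a real ESN the reservoir weights would additionally be rescaled to satisfy the echo-state property, and only a readout layer would be trained:

import numpy as np

def deep_esn_step(x_t, states, W_in, W_res):
    # One time step of a DeepESN (Eq. 31): each layer l reads the current
    # output of layer l-1 and its own previous state:
    #   h_t^l = tanh(W_in^l @ h_t^{l-1} + W_res^l @ h_{t-1}^l)
    new_states = []
    prev_layer_out = x_t            # for the first layer, h_t^{l-1} is the input
    for l in range(len(states)):
        h = np.tanh(W_in[l] @ prev_layer_out + W_res[l] @ states[l])
        new_states.append(h)
        prev_layer_out = h          # feed this layer's state to the next layer
    return new_states

# Toy setup: two reservoir layers of 50 units each, 3-dimensional input;
# reservoir weights are fixed and random.
rng = np.random.default_rng(0)
sizes = [3, 50, 50]
W_in = [0.1 * rng.normal(size=(sizes[l + 1], sizes[l])) for l in range(2)]
W_res = [0.1 * rng.normal(size=(sizes[l + 1], sizes[l + 1])) for l in range(2)]
states = [np.zeros(50), np.zeros(50)]
states = deep_esn_step(rng.normal(size=3), states, W_in, W_res)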

4.4.3. Independently Recurrent Neural Network

5. Innovations in RNN Architectures and Training Methodologies
  5.1. Hybrid Architectures
  5.2. Neural Architecture Search
  5.3. Advanced Optimization Techniques
  5.4. RNNs with Attention Mechanisms
  5.5. RNNs Integrated with Transformer Models
6. Public Datasets for RNN Research
7. Applications of RNNs in Peer-Reviewed Literature
  7.1. Natural Language Processing
    7.1.1. Text Generation
    7.1.2. Sentiment Analysis
    7.1.3. Machine Translation
  7.2. Speech Recognition
  7.3. Time Series Forecasting
  7.4. Signal Processing
  7.5. Bioinformatics
  7.6. Autonomous Vehicles
  7.7. Anomaly Detection
8. Challenges and Future Research Directions
  8.1. Scalability and Efficiency
  8.2. Interpretability and Explainability
  8.3. Bias and Fairness
  8.4. Data Dependency and Quality
  8.5. Overfitting and Generalization
9. Conclusions
Author Contributions
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations

AI: Artificial intelligence
ANN: Artificial neural network
BiLSTM: Bidirectional long short-term memory
CNN: Convolutional neural network
DL: Deep learning
GRU: Gated recurrent unit
LSTM: Long short-term memory
ML: Machine learning
NAS: Neural architecture search
NLP: Natural language processing
RNN: Recurrent neural network
RL: Reinforcement learning
SHAP: Shapley Additive Explanations
TPU: Tensor processing unit
VAE: Variational autoencoder
  • O’Halloran, T.; Obaido, G.; Otegbade, B.; Mienye, I.D. A deep learning approach for Maize Lethal Necrosis and Maize Streak Virus disease detection. Mach. Learn. Appl. 2024 , 16 , 100556. [ Google Scholar ] [ CrossRef ]
  • Peng, Y.; He, L.; Hu, D.; Liu, Y.; Yang, L.; Shang, S. Decoupling Deep Learning for Enhanced Image Recognition Interpretability. ACM Trans. Multimed. Comput. Commun. Appl. 2024.
  • Khan, W.; Daud, A.; Khan, K.; Muhammad, S.; Haq, R. Exploring the frontiers of deep learning and natural language processing: A comprehensive overview of key challenges and emerging trends. Nat. Lang. Process. J. 2023, 4, 100026.
  • Obaido, G.; Achilonu, O.; Ogbuokiri, B.; Amadi, C.S.; Habeebullahi, L.; Ohalloran, T.; Chukwu, C.W.; Mienye, E.; Aliyu, M.; Fasawe, O.; et al. An Improved Framework for Detecting Thyroid Disease Using Filter-Based Feature Selection and Stacking Ensemble. IEEE Access 2024, 12, 89098–89112.
  • Mienye, I.D.; Obaido, G.; Aruleba, K.; Dada, O.A. Enhanced Prediction of Chronic Kidney Disease using Feature Selection and Boosted Classifiers. In Proceedings of the International Conference on Intelligent Systems Design and Applications, Virtual, 13–15 December 2021; pp. 527–537.
  • Al-Jumaili, A.H.A.; Muniyandi, R.C.; Hasan, M.K.; Paw, J.K.S.; Singh, M.J. Big data analytics using cloud computing based frameworks for power management systems: Status, constraints, and future recommendations. Sensors 2023, 23, 2952.
  • Gill, S.S.; Wu, H.; Patros, P.; Ottaviani, C.; Arora, P.; Pujol, V.C.; Haunschild, D.; Parlikad, A.K.; Cetinkaya, O.; Lutfiyya, H.; et al. Modern computing: Vision and challenges. Telemat. Inform. Rep. 2024, 13, 100116.
  • Mienye, I.D.; Jere, N. A Survey of Decision Trees: Concepts, Algorithms, and Applications. IEEE Access 2024, 12, 86716–86727.
  • Aruleba, R.T.; Adekiya, T.A.; Ayawei, N.; Obaido, G.; Aruleba, K.; Mienye, I.D.; Aruleba, I.; Ogbuokiri, B. COVID-19 diagnosis: A review of rapid antigen, RT-PCR and artificial intelligence methods. Bioengineering 2022, 9, 153.
  • Alhajeri, M.S.; Ren, Y.M.; Ou, F.; Abdullah, F.; Christofides, P.D. Model predictive control of nonlinear processes using transfer learning-based recurrent neural networks. Chem. Eng. Res. Des. 2024, 205, 1–12.
  • Shahinzadeh, H.; Mahmoudi, A.; Asilian, A.; Sadrarhami, H.; Hemmati, M.; Saberi, Y. Deep Learning: An Overview of Theory and Architectures. In Proceedings of the 2024 20th CSI International Symposium on Artificial Intelligence and Signal Processing (AISP), Babol, Iran, 21–22 February 2024; pp. 1–11.
  • Baruah, R.D.; Organero, M.M. Explicit Context Integrated Recurrent Neural Network for applications in smart environments. Expert Syst. Appl. 2024, 255, 124752.
  • Werbos, P. Backpropagation through time: What it does and how to do it. Proc. IEEE 1990, 78, 1550–1560.
  • Lalapura, V.S.; Amudha, J.; Satheesh, H.S. Recurrent neural networks for edge intelligence: A survey. ACM Comput. Surv. (CSUR) 2021, 54, 1–38.
  • Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780.
  • Cho, K.; Van Merriënboer, B.; Gulcehre, C.; Bahdanau, D.; Bougares, F.; Schwenk, H.; Bengio, Y. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv 2014, arXiv:1406.1078.
  • Liu, F.; Li, J.; Wang, L. PI-LSTM: Physics-informed long short-term memory network for structural response modeling. Eng. Struct. 2023, 292, 116500.
  • Ni, Q.; Ji, J.; Feng, K.; Zhang, Y.; Lin, D.; Zheng, J. Data-driven bearing health management using a novel multi-scale fused feature and gated recurrent unit. Reliab. Eng. Syst. Saf. 2024, 242, 109753.
  • Niu, Z.; Zhong, G.; Yue, G.; Wang, L.N.; Yu, H.; Ling, X.; Dong, J. Recurrent attention unit: A new gated recurrent unit for long-term memory of important parts in sequential data. Neurocomputing 2023, 517, 1–9.
  • Lipton, Z.C.; Berkowitz, J.; Elkan, C. A critical review of recurrent neural networks for sequence learning. arXiv 2015, arXiv:1506.00019.
  • Yu, Y.; Si, X.; Hu, C.; Zhang, J. A review of recurrent neural networks: LSTM cells and network architectures. Neural Comput. 2019, 31, 1235–1270.
  • Tarwani, K.M.; Edem, S. Survey on recurrent neural network in natural language processing. Int. J. Eng. Trends Technol. 2017, 48, 301–304.
  • Tsoi, A.C.; Back, A.D. Locally recurrent globally feedforward networks: A critical review of architectures. IEEE Trans. Neural Netw. 1994, 5, 229–239.
  • Mastorocostas, P.A.; Theocharis, J.B. A stable learning algorithm for block-diagonal recurrent neural networks: Application to the analysis of lung sounds. IEEE Trans. Syst. Man Cybern. Part B (Cybern.) 2006, 36, 242–254.
  • Dutta, K.K.; Poornima, S.; Sharma, R.; Nair, D.; Ploeger, P.G. Applications of Recurrent Neural Network: Overview and Case Studies. In Recurrent Neural Networks; CRC Press: Boca Raton, FL, USA, 2022; pp. 23–41.
  • Quradaa, F.H.; Shahzad, S.; Almoqbily, R.S. A systematic literature review on the applications of recurrent neural networks in code clone research. PLoS ONE 2024, 19, e0296858.
  • Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016.
  • Greff, K.; Srivastava, R.K.; Koutník, J.; Steunebrink, B.R.; Schmidhuber, J. LSTM: A search space odyssey. IEEE Trans. Neural Netw. Learn. Syst. 2016, 28, 2222–2232.
  • Al-Selwi, S.M.; Hassan, M.F.; Abdulkadir, S.J.; Muneer, A.; Sumiea, E.H.; Alqushaibi, A.; Ragab, M.G. RNN-LSTM: From applications to modeling techniques and beyond—Systematic review. J. King Saud Univ.-Comput. Inf. Sci. 2024, 36, 102068.
  • Zaremba, W.; Sutskever, I.; Vinyals, O. Recurrent neural network regularization. arXiv 2014, arXiv:1409.2329.
  • Bai, S.; Kolter, J.Z.; Koltun, V. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv 2018, arXiv:1803.01271.
  • Che, Z.; Purushotham, S.; Cho, K.; Sontag, D.; Liu, Y. Recurrent neural networks for multivariate time series with missing values. Sci. Rep. 2018, 8, 6085.
  • Chung, J.; Gulcehre, C.; Cho, K.; Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv 2014, arXiv:1412.3555.
  • Badawy, M.; Ramadan, N.; Hefny, H.A. Healthcare predictive analytics using machine learning and deep learning techniques: A survey. J. Electr. Syst. Inf. Technol. 2023, 10, 40.
  • Ismaeel, A.G.; Janardhanan, K.; Sankar, M.; Natarajan, Y.; Mahmood, S.N.; Alani, S.; Shather, A.H. Traffic pattern classification in smart cities using deep recurrent neural network. Sustainability 2023, 15, 14522.
  • Mers, M.; Yang, Z.; Hsieh, Y.A.; Tsai, Y. Recurrent neural networks for pavement performance forecasting: Review and model performance comparison. Transp. Res. Rec. 2023, 2677, 610–624.
  • Chen, Y.; Cheng, Q.; Cheng, Y.; Yang, H.; Yu, H. Applications of recurrent neural networks in environmental factor forecasting: A review. Neural Comput. 2018, 30, 2855–2881.
  • Linardos, V.; Drakaki, M.; Tzionas, P.; Karnavas, Y.L. Machine learning in disaster management: Recent developments in methods and applications. Mach. Learn. Knowl. Extr. 2022, 4, 446–473.
  • Zhang, J.; Liu, H.; Chang, Q.; Wang, L.; Gao, R.X. Recurrent neural network for motion trajectory prediction in human-robot collaborative assembly. CIRP Ann. 2020, 69, 9–12.
  • Tsantekidis, A.; Passalis, N.; Tefas, A. Recurrent Neural Networks. In Deep Learning for Robot Perception and Cognition; Elsevier: Amsterdam, The Netherlands, 2022; pp. 101–115.
  • Mienye, I.D.; Jere, N. Deep Learning for Credit Card Fraud Detection: A Review of Algorithms, Challenges, and Solutions. IEEE Access 2024, 12, 96893–96910.
  • Mienye, I.D.; Sun, Y. A machine learning method with hybrid feature selection for improved credit card fraud detection. Appl. Sci. 2023, 13, 7254.
  • Rezk, N.M.; Purnaprajna, M.; Nordström, T.; Ul-Abdin, Z. Recurrent neural networks: An embedded computing perspective. IEEE Access 2020, 8, 57967–57996.
  • Yu, Y.; Adu, K.; Tashi, N.; Anokye, P.; Wang, X.; Ayidzoe, M.A. RMAF: ReLU-memristor-like activation function for deep learning. IEEE Access 2020, 8, 72727–72741.
  • Mienye, I.D.; Ainah, P.K.; Emmanuel, I.D.; Esenogho, E. Sparse Noise Minimization in Image Classification using Genetic Algorithm and DenseNet. In Proceedings of the 2021 Conference on Information Communications Technology and Society (ICTAS), Durban, South Africa, 10–11 March 2021; pp. 103–108.
  • Ciaburro, G.; Venkateswaran, B. Neural Networks with R: Smart Models Using CNN, RNN, Deep Learning, and Artificial Intelligence Principles; Packt Publishing Ltd.: Birmingham, UK, 2017.
  • Nwankpa, C.; Ijomah, W.; Gachagan, A.; Marshall, S. Activation functions: Comparison of trends in practice and research for deep learning. arXiv 2018, arXiv:1811.03378.
  • Szandała, T. Review and comparison of commonly used activation functions for deep neural networks. Bio-Inspired Neurocomp. 2021, 203–224.
  • Clevert, D.A.; Unterthiner, T.; Hochreiter, S. Fast and accurate deep network learning by exponential linear units (ELUs). arXiv 2015, arXiv:1511.07289.
  • Dubey, S.R.; Singh, S.K.; Chaudhuri, B.B. Activation functions in deep learning: A comprehensive survey and benchmark. Neurocomputing 2022, 503, 92–108.
  • Obaido, G.; Mienye, I.D.; Egbelowo, O.F.; Emmanuel, I.D.; Ogunleye, A.; Ogbuokiri, B.; Mienye, P.; Aruleba, K. Supervised machine learning in drug discovery and development: Algorithms, applications, challenges, and prospects. Mach. Learn. Appl. 2024, 17, 100576.
  • Mienye, I.D.; Sun, Y. Effective Feature Selection for Improved Prediction of Heart Disease. In Proceedings of the Pan-African Artificial Intelligence and Smart Systems Conference, Durban, South Africa, 4–6 December 2021; pp. 94–107.
  • Martins, A.; Astudillo, R. From Softmax to Sparsemax: A Sparse Model of Attention and Multi-Label Classification. In Proceedings of the International Conference on Machine Learning, New York, NY, USA, 20–22 June 2016; pp. 1614–1623.
  • Bianchi, F.M.; Maiorino, E.; Kampffmeyer, M.C.; Rizzi, A.; Jenssen, R. Properties and Training in Recurrent Neural Networks. In Recurrent Neural Networks for Short-Term Load Forecasting: An Overview and Comparative Analysis; Springer: Berlin/Heidelberg, Germany, 2017; pp. 9–21.
  • Mohajerin, N.; Waslander, S.L. State Initialization for Recurrent Neural Network Modeling of Time-Series Data. In Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, USA, 14–19 May 2017; pp. 2330–2337.
  • Forgione, M.; Muni, A.; Piga, D.; Gallieri, M. On the adaptation of recurrent neural networks for system identification. Automatica 2023, 155, 111092.
  • Zhang, J.; He, T.; Sra, S.; Jadbabaie, A. Why gradient clipping accelerates training: A theoretical justification for adaptivity. arXiv 2019, arXiv:1905.11881.
  • Qian, J.; Wu, Y.; Zhuang, B.; Wang, S.; Xiao, J. Understanding Gradient Clipping in Incremental Gradient Methods. In Proceedings of the International Conference on Artificial Intelligence and Statistics, Virtual, 13–15 April 2021; pp. 1504–1512.
  • Fei, H.; Tan, F. Bidirectional grid long short-term memory (BiGridLSTM): A method to address context-sensitivity and vanishing gradient. Algorithms 2018, 11, 172.
  • Dong, X.; Chowdhury, S.; Qian, L.; Li, X.; Guan, Y.; Yang, J.; Yu, Q. Deep learning for named entity recognition on Chinese electronic medical records: Combining deep transfer learning with multitask bi-directional LSTM RNN. PLoS ONE 2019, 14, e0216046.
  • Chorowski, J.K.; Bahdanau, D.; Serdyuk, D.; Cho, K.; Bengio, Y. Attention-based models for speech recognition. Adv. Neural Inf. Process. Syst. 2015, 28.
  • Zhou, M.; Duan, N.; Liu, S.; Shum, H.Y. Progress in neural NLP: Modeling, learning, and reasoning. Engineering 2020, 6, 275–290.
  • Naseem, U.; Razzak, I.; Khan, S.K.; Prasad, M. A comprehensive survey on word representation models: From classical to state-of-the-art word representation language models. Trans. Asian Low-Resour. Lang. Inf. Process. 2021, 20, 1–35.
  • Adil, M.; Wu, J.Z.; Chakrabortty, R.K.; Alahmadi, A.; Ansari, M.F.; Ryan, M.J. Attention-based STL-BiLSTM network to forecast tourist arrival. Processes 2021, 9, 1759.
  • Min, S.; Park, S.; Kim, S.; Choi, H.S.; Lee, B.; Yoon, S. Pre-training of deep bidirectional protein sequence representations with structural information. IEEE Access 2021, 9, 123912–123926.
  • Jain, A.; Zamir, A.R.; Savarese, S.; Saxena, A. Structural-RNN: Deep Learning on Spatio-Temporal Graphs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 5308–5317.
  • Pascanu, R.; Gulcehre, C.; Cho, K.; Bengio, Y. How to construct deep recurrent neural networks. arXiv 2013, arXiv:1312.6026.
  • Shi, H.; Xu, M.; Li, R. Deep learning for household load forecasting—A novel pooling deep RNN. IEEE Trans. Smart Grid 2017, 9, 5271–5280.
  • Gal, Y.; Ghahramani, Z. A theoretically grounded application of dropout in recurrent neural networks. Adv. Neural Inf. Process. Syst. 2016, 29.
  • Moradi, R.; Berangi, R.; Minaei, B. A survey of regularization strategies for deep models. Artif. Intell. Rev. 2020, 53, 3947–3986.
  • Salehin, I.; Kang, D.K. A review on dropout regularization approaches for deep neural networks within the scholarly domain. Electronics 2023, 12, 3106.
  • Cai, S.; Shu, Y.; Chen, G.; Ooi, B.C.; Wang, W.; Zhang, M. Effective and efficient dropout for deep convolutional neural networks. arXiv 2019, arXiv:1904.03392.
  • Garbin, C.; Zhu, X.; Marques, O. Dropout vs. batch normalization: An empirical study of their impact to deep learning. Multimed. Tools Appl. 2020, 79, 12777–12815.
  • Borawar, L.; Kaur, R. ResNet: Solving Vanishing Gradient in Deep Networks. In Proceedings of the International Conference on Recent Trends in Computing: ICRTC 2022, Delhi, India, 3–4 June 2022; Springer: Berlin/Heidelberg, Germany, 2023; pp. 235–247.
  • Mienye, I.D.; Sun, Y. A deep learning ensemble with data resampling for credit card fraud detection. IEEE Access 2023, 11, 30628–30638.
  • Kiperwasser, E.; Goldberg, Y. Simple and accurate dependency parsing using bidirectional LSTM feature representations. Trans. Assoc. Comput. Linguist. 2016, 4, 313–327.
  • Zhang, W.; Li, H.; Tang, L.; Gu, X.; Wang, L.; Wang, L. Displacement prediction of Jiuxianping landslide using gated recurrent unit (GRU) networks. Acta Geotech. 2022, 17, 1367–1382.
  • Cahuantzi, R.; Chen, X.; Güttel, S. A Comparison of LSTM and GRU Networks for Learning Symbolic Sequences. In Proceedings of the Science and Information Conference, Nanchang, China, 2–4 June 2023; Springer: Berlin/Heidelberg, Germany, 2023; pp. 771–785.
  • Shewalkar, A.; Nyavanandi, D.; Ludwig, S.A. Performance evaluation of deep neural networks applied to speech recognition: RNN, LSTM and GRU. J. Artif. Intell. Soft Comput. Res. 2019, 9, 235–245.
  • Vatanchi, S.M.; Etemadfard, H.; Maghrebi, M.F.; Shad, R. A comparative study on forecasting of long-term daily streamflow using ANN, ANFIS, BiLSTM and CNN-GRU-LSTM. Water Resour. Manag. 2023, 37, 4769–4785.
  • Mateus, B.C.; Mendes, M.; Farinha, J.T.; Assis, R.; Cardoso, A.M. Comparing LSTM and GRU models to predict the condition of a pulp paper press. Energies 2021, 14, 6958.
  • Gers, F.A.; Schmidhuber, J. Recurrent Nets That Time and Count. In Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks, IJCNN 2000, Neural Computing: New Challenges and Perspectives for the New Millennium, Como, Italy, 24–27 July 2000; Volume 3, pp. 189–194.
  • Gers, F.A.; Schraudolph, N.N.; Schmidhuber, J. Learning precise timing with LSTM recurrent networks. J. Mach. Learn. Res. 2002, 3, 115–143.
  • Jaeger, H. Adaptive nonlinear system identification with echo state networks. Adv. Neural Inf. Process. Syst. 2002, 15, 593–600.
  • Ishaq, M.; Kwon, S. A CNN-Assisted deep echo state network using multiple Time-Scale dynamic learning reservoirs for generating Short-Term solar energy forecasting. Sustain. Energy Technol. Assessments 2022, 52, 102275.
  • Sun, C.; Song, M.; Cai, D.; Zhang, B.; Hong, S.; Li, H. A systematic review of echo state networks from design to application. IEEE Trans. Artif. Intell. 2022, 5, 23–37.
  • Gallicchio, C.; Micheli, A. Deep echo state network (DeepESN): A brief survey. arXiv 2017, arXiv:1712.04323.
  • Gallicchio, C.; Micheli, A. Richness of Deep Echo State Network Dynamics. In Proceedings of the Advances in Computational Intelligence: 15th International Work-Conference on Artificial Neural Networks, IWANN 2019, Gran Canaria, Spain, 12–14 June 2019, Proceedings, Part I; Springer: Berlin/Heidelberg, Germany, 2019; pp. 480–491.
  • Hu, R.; Tang, Z.R.; Song, X.; Luo, J.; Wu, E.Q.; Chang, S. Ensemble echo network with deep architecture for time-series modeling. Neural Comput. Appl. 2021, 33, 4997–5010.
  • Gao, R.; Li, R.; Hu, M.; Suganthan, P.N.; Yuen, K.F. Dynamic ensemble deep echo state network for significant wave height forecasting. Appl. Energy 2023, 329, 120261.
  • Gao, R.; Du, L.; Duru, O.; Yuen, K.F. Time series forecasting based on echo state network and empirical wavelet transformation. Appl. Soft Comput. 2021, 102, 107111.
  • Li, S.; Li, W.; Cook, C.; Zhu, C.; Gao, Y. Independently Recurrent Neural Network (IndRNN): Building a Longer and Deeper RNN. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 5457–5466.
  • Yang, J.; Qu, J.; Mi, Q.; Li, Q. A CNN-LSTM model for tailings dam risk prediction. IEEE Access 2020, 8, 206491–206502.
  • Ren, P.; Xiao, Y.; Chang, X.; Huang, P.Y.; Li, Z.; Chen, X.; Wang, X. A comprehensive survey of neural architecture search: Challenges and solutions. ACM Comput. Surv. (CSUR) 2021, 54, 1–34.
  • Mellor, J.; Turner, J.; Storkey, A.; Crowley, E.J. Neural Architecture Search without Training. In Proceedings of the International Conference on Machine Learning, Virtual, 18–24 July 2021; pp. 7588–7598.
  • Zoph, B.; Le, Q.V. Neural architecture search with reinforcement learning. arXiv 2016, arXiv:1611.01578.
  • Chen, X.; Wu, S.Z.; Hong, M. Understanding gradient clipping in private SGD: A geometric perspective. Adv. Neural Inf. Process. Syst. 2020, 33, 13773–13782.
  • Zhang, Z. Improved Adam Optimizer for Deep Neural Networks. In Proceedings of the 2018 IEEE/ACM 26th International Symposium on Quality of Service (IWQoS), Banff, AB, Canada, 4–6 June 2018; pp. 1–2.
  • De Santana Correia, A.; Colombini, E.L. Attention, please! A survey of neural attention models in deep learning. Artif. Intell. Rev. 2022, 55, 6037–6124.
  • Lin, J.; Ma, J.; Zhu, J.; Cui, Y. Short-term load forecasting based on LSTM networks considering attention mechanism. Int. J. Electr. Power Energy Syst. 2022, 137, 107818.
  • Chaudhari, S.; Mithal, V.; Polatkan, G.; Ramanath, R. An attentive survey of attention models. ACM Trans. Intell. Syst. Technol. (TIST) 2021, 12, 1–32.
  • Bahdanau, D.; Cho, K.; Bengio, Y. Neural machine translation by jointly learning to align and translate. arXiv 2014, arXiv:1409.0473.
  • Luong, M.T.; Pham, H.; Manning, C.D. Effective approaches to attention-based neural machine translation. arXiv 2015, arXiv:1508.04025.
  • Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. Adv. Neural Inf. Process. Syst. 2017, 30.
  • Marcus, M.P.; Marcinkiewicz, M.A.; Santorini, B. Building a large annotated corpus of English: The Penn Treebank. Comput. Linguist. 1993, 19, 313–330.
  • Maas, A.L.; Daly, R.E.; Pham, P.T.; Huang, D.; Ng, A.Y.; Potts, C. Learning Word Vectors for Sentiment Analysis. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Portland, OR, USA, 19–24 June 2011; pp. 142–150.
  • LeCun, Y.; Bottou, L.; Bengio, Y.; Haffner, P. Gradient-based learning applied to document recognition. Proc. IEEE 1998, 86, 2278–2324.
  • Garofolo, J.S.; Lamel, L.F.; Fisher, W.M.; Fiscus, J.G.; Pallett, D.S. TIMIT acoustic-phonetic continuous speech corpus. Linguist. Data Consort. 1993, 93, 27403.
  • Lewis, D. Reuters-21578 Text Categorization Test Collection; Distribution 1.0; AT&T Labs-Research: Atlanta, GA, USA, 1997.
  • Dua, D.; Graff, C. UCI Machine Learning Repository; School of Information and Computer Science, University of California: Irvine, CA, USA, 2017.
  • Lomonaco, V.; Maltoni, D. CORe50: A New Dataset and Benchmark for Continuous Object Recognition. In Proceedings of the Conference on Robot Learning, PMLR, Mountain View, CA, USA, 13–15 November 2017; pp. 17–26.
  • Souri, A.; El Maazouzi, Z.; Al Achhab, M.; El Mohajir, B.E. Arabic Text Generation using Recurrent Neural Networks. In Proceedings of the Big Data, Cloud and Applications: Third International Conference, BDCA 2018, Kenitra, Morocco, 4–5 April 2018; Revised Selected Papers 3; Springer: Berlin/Heidelberg, Germany, 2018; pp. 523–533.
  • Islam, M.S.; Mousumi, S.S.S.; Abujar, S.; Hossain, S.A. Sequence-to-sequence Bangla sentence generation with LSTM recurrent neural networks. Procedia Comput. Sci. 2019, 152, 51–58.
  • Gajendran, S.; Manjula, D.; Sugumaran, V. Character level and word level embedding with bidirectional LSTM–Dynamic recurrent neural network for biomedical named entity recognition from literature. J. Biomed. Inform. 2020, 112, 103609.
  • Hu, H.; Liao, M.; Mao, W.; Liu, W.; Zhang, C.; Jing, Y. Variational Auto-Encoder for Text Generation. In Proceedings of the 2020 IEEE 5th Information Technology and Mechatronics Engineering Conference (ITOEC), Chongqing, China, 12–14 June 2020; pp. 595–598.
  • Holtzman, A.; Buys, J.; Du, L.; Forbes, M.; Choi, Y. The curious case of neural text degeneration. arXiv 2019, arXiv:1904.09751.
  • Yin, W.; Schütze, H. Attentive convolution: Equipping CNNs with RNN-style attention mechanisms. Trans. Assoc. Comput. Linguist. 2018, 6, 687–702.
  • Hussein, M.A.H.; Savaş, S. LSTM-Based Text Generation: A Study on Historical Datasets. arXiv 2024, arXiv:2403.07087.
  • Baskaran, S.; Alagarsamy, S.; S, S.; Shivam, S. Text Generation using Long Short-Term Memory. In Proceedings of the 2024 Third International Conference on Intelligent Techniques in Control, Optimization and Signal Processing (INCOS), Krishnankoil, India, 14–16 March 2024; pp. 1–6.
  • Keskar, N.S.; McCann, B.; Varshney, L.R.; Xiong, C.; Socher, R. CTRL: A conditional transformer language model for controllable generation. arXiv 2019, arXiv:1909.05858.
  • Guo, H. Generating text with deep reinforcement learning. arXiv 2015, arXiv:1510.09202.
  • Yadav, V.; Verma, P.; Katiyar, V. Long short term memory (LSTM) model for sentiment analysis in social data for e-commerce products reviews in Hindi languages. Int. J. Inf. Technol. 2023, 15, 759–772.
  • Abimbola, B.; de La Cal Marin, E.; Tan, Q. Enhancing Legal Sentiment Analysis: A Convolutional Neural Network–Long Short-Term Memory Document-Level Model. Mach. Learn. Knowl. Extr. 2024, 6, 877–897.
  • Zulqarnain, M.; Ghazali, R.; Aamir, M.; Hassim, Y.M.M. An efficient two-state GRU based on feature attention mechanism for sentiment analysis. Multimed. Tools Appl. 2024, 83, 3085–3110.
  • Pujari, P.; Padalia, A.; Shah, T.; Devadkar, K. Hybrid CNN and RNN for Twitter Sentiment Analysis. In Proceedings of the International Conference on Smart Computing and Communication; Springer: Berlin/Heidelberg, Germany, 2024; pp. 297–310.
  • Wankhade, M.; Annavarapu, C.S.R.; Abraham, A. CBMAFM: CNN-BiLSTM multi-attention fusion mechanism for sentiment classification. Multimed. Tools Appl. 2024, 83, 51755–51786.
  • Sangeetha, J.; Kumaran, U. A hybrid optimization algorithm using BiLSTM structure for sentiment analysis. Meas. Sensors 2023, 25, 100619.
  • He, R.; McAuley, J. Ups and Downs: Modeling the Visual Evolution of Fashion Trends with One-Class Collaborative Filtering. In Proceedings of the 25th International Conference on World Wide Web, Montreal, QC, Canada, 11–15 April 2016; pp. 507–517.
  • Samir, A.; Elkaffas, S.M.; Madbouly, M.M. Twitter Sentiment Analysis using BERT. In Proceedings of the 2021 31st International Conference on Computer Theory and Applications (ICCTA), Kochi, Kerala, India, 17–19 August 2021; pp. 182–186.
  • Prottasha, N.J.; Sami, A.A.; Kowsher, M.; Murad, S.A.; Bairagi, A.K.; Masud, M.; Baz, M. Transfer learning for sentiment analysis using BERT based supervised fine-tuning. Sensors 2022, 22, 4157.
  • Mujahid, M.; Rustam, F.; Shafique, R.; Chunduri, V.; Villar, M.G.; Ballester, J.B.; Diez, I.d.l.T.; Ashraf, I. Analyzing sentiments regarding ChatGPT using novel BERT: A machine learning approach. Information 2023, 14, 474.
  • Wu, Y.; Schuster, M.; Chen, Z.; Le, Q.V.; Norouzi, M.; Macherey, W.; Krikun, M.; Cao, Y.; Gao, Q.; Macherey, K.; et al. Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv 2016, arXiv:1609.08144.
  • Sennrich, R.; Haddow, B.; Birch, A. Neural machine translation of rare words with subword units. arXiv 2015, arXiv:1508.07909.
  • Kang, L.; He, S.; Wang, M.; Long, F.; Su, J. Bilingual attention based neural machine translation. Appl. Intell. 2023, 53, 4302–4315.
  • Yang, Z.; Dai, Z.; Salakhutdinov, R.; Cohen, W.W. Breaking the softmax bottleneck: A high-rank RNN language model. arXiv 2017, arXiv:1711.03953.
  • Song, K.; Tan, X.; Qin, T.; Lu, J.; Liu, T.Y. MASS: Masked sequence to sequence pre-training for language generation. arXiv 2019, arXiv:1905.02450.
  • Hinton, G.; Deng, L.; Yu, D.; Dahl, G.E.; Mohamed, A.r.; Jaitly, N.; Senior, A.; Vanhoucke, V.; Nguyen, P.; Sainath, T.N.; et al. Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Process. Mag. 2012, 29, 82–97.
  • Hannun, A.; Case, C.; Casper, J.; Catanzaro, B.; Diamos, G.; Elsen, E.; Prenger, R.; Satheesh, S.; Sengupta, S.; Coates, A.; et al. Deep Speech: Scaling up end-to-end speech recognition. arXiv 2014, arXiv:1412.5567.
  • Amodei, D.; Ananthanarayanan, S.; Anubhai, R.; Bai, J.; Battenberg, E.; Case, C.; Casper, J.; Catanzaro, B.; Cheng, Q.; Chen, G.; et al. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin. In Proceedings of the International Conference on Machine Learning, New York, NY, USA, 20–22 June 2016; pp. 173–182.
  • Chiu, C.C.; Sainath, T.N.; Wu, Y.; Prabhavalkar, R.; Nguyen, P.; Chen, Z.; Kannan, A.; Weiss, R.J.; Rao, K.; Gonina, E.; et al. State-of-the-Art Speech Recognition with Sequence-to-Sequence Models. In Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, Canada, 15–20 April 2018; pp. 4774–4778.
  • Zhang, Y.; Chan, W.; Jaitly, N. Very Deep Convolutional Networks for End-to-End Speech Recognition. In Proceedings of the 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), New Orleans, LA, USA, 5–9 March 2017; pp. 4845–4849.
  • Dong, L.; Xu, S.; Xu, B. Speech-Transformer: A No-Recurrence Sequence-to-Sequence Model for Speech Recognition. In Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Calgary, AB, Canada, 15–20 April 2018; pp. 5884–5888.
  • Bhaskar, S.; Thasleema, T. LSTM model for visual speech recognition through facial expressions. Multimed. Tools Appl. 2023, 82, 5455–5472.
  • Daouad, M.; Allah, F.A.; Dadi, E.W. An automatic speech recognition system for isolated Amazigh word using 1D & 2D CNN-LSTM architecture. Int. J. Speech Technol. 2023, 26, 775–787.
  • Dhanjal, A.S.; Singh, W. A comprehensive survey on automatic speech recognition using neural networks. Multimed. Tools Appl. 2024, 83, 23367–23412.
  • Nasr, S.; Duwairi, R.; Quwaider, M. End-to-end speech recognition for Arabic dialects. Arab. J. Sci. Eng. 2023, 48, 10617–10633.
  • Kumar, D.; Aziz, S. Performance Evaluation of Recurrent Neural Networks-LSTM and GRU for Automatic Speech Recognition. In Proceedings of the 2023 International Conference on Computer, Electronics & Electrical Engineering & Their Applications (IC2E3), Srinagar Garhwal, India, 8–9 June 2023; pp. 1–6.
  • Fischer, T.; Krauss, C. Deep learning with long short-term memory networks for financial market predictions. Eur. J. Oper. Res. 2018, 270, 654–669.
  • Nelson, D.M.; Pereira, A.C.; De Oliveira, R.A. Stock Market's Price Movement Prediction with LSTM Neural Networks. In Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, USA, 14–19 May 2017; pp. 1419–1426.
  • Luo, A.; Zhong, L.; Wang, J.; Wang, Y.; Li, S.; Tai, W. Short-term stock correlation forecasting based on CNN-BiLSTM enhanced by attention mechanism. IEEE Access 2024, 12, 29617–29632.
  • Bao, W.; Yue, J.; Rao, Y. A deep learning framework for financial time series using stacked autoencoders and long-short term memory. PLoS ONE 2017, 12, e0180944.
  • Feng, F.; Chen, H.; He, X.; Ding, J.; Sun, M.; Chua, T.S. Enhancing Stock Movement Prediction with Adversarial Training. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19), Macao, China, 10–16 August 2019; Volume 19, pp. 5843–5849.
  • Rundo, F. Deep LSTM with reinforcement learning layer for financial trend prediction in FX high frequency trading systems. Appl. Sci. 2019, 9, 4460.
  • Devi, T.; Deepa, N.; Gayathri, N.; Rakesh Kumar, S. AI-Based Weather Forecasting System for Smart Agriculture System Using a Recurrent Neural Networks (RNN) Algorithm. Sustain. Manag. Electron. Waste 2024, 97–112.
  • Anshuka, A.; Chandra, R.; Buzacott, A.J.; Sanderson, D.; van Ogtrop, F.F. Spatio temporal hydrological extreme forecasting framework using LSTM deep learning model. Stoch. Environ. Res. Risk Assess. 2022, 36, 3467–3485.
  • Marulanda, G.; Cifuentes, J.; Bello, A.; Reneses, J. A hybrid model based on LSTM neural networks with attention mechanism for short-term wind power forecasting. Wind Eng. 2023, 0309524X231191163.
  • Chen, W.; An, N.; Jiang, M.; Jia, L. An improved deep temporal convolutional network for new energy stock index prediction. Inf. Sci. 2024, 682, 121244.
  • Hasanat, S.M.; Younis, R.; Alahmari, S.; Ejaz, M.T.; Haris, M.; Yousaf, H.; Watara, S.; Ullah, K.; Ullah, Z. Enhancing Load Forecasting Accuracy in Smart Grids: A Novel Parallel Multichannel Network Approach Using 1D CNN and Bi-LSTM Models. Int. J. Energy Res. 2024, 2024, 2403847.
  • Asiri, M.M.; Aldehim, G.; Alotaibi, F.; Alnfiai, M.M.; Assiri, M.; Mahmud, A. Short-term load forecasting in smart grids using hybrid deep learning. IEEE Access 2024, 12, 23504–23513.
  • Yıldız Doğan, G.; Aksoy, A.; Öztürk, N. A Hybrid Deep Learning Model to Estimate the Future Electricity Demand of Sustainable Cities. Sustainability 2024, 16, 6503.
  • Bhambu, A.; Gao, R.; Suganthan, P.N. Recurrent ensemble random vector functional link neural network for financial time series forecasting. Appl. Soft Comput. 2024, 161, 111759.
  • Mienye, E.; Jere, N.; Obaido, G.; Mienye, I.D.; Aruleba, K. Deep Learning in Finance: A Survey of Applications and Techniques. Preprints 2024.
  • Mastoi, Q.U.A.; Wah, T.Y.; Gopal Raj, R. Reservoir computing based echo state networks for ventricular heart beat classification. Appl. Sci. 2019, 9, 702.
  • Valin, J.M.; Tenneti, S.; Helwani, K.; Isik, U.; Krishnaswamy, A. Low-Complexity, Real-Time Joint Neural Echo Control and Speech Enhancement Based on PercepNet. In Proceedings of the ICASSP 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Toronto, ON, Canada, 6–11 June 2021; pp. 7133–7137.
  • Li, Y.; Huang, C.; Ding, L.; Li, Z.; Pan, Y.; Gao, X. Deep learning in bioinformatics: Introduction, application, and perspective in the big data era. Methods 2019, 166, 4–21.
  • Zhang, Y.; Qiao, S.; Ji, S.; Li, Y. DeepSite: Bidirectional LSTM and CNN models for predicting DNA–protein binding. Int. J. Mach. Learn. Cybern. 2020, 11, 841–851.
  • Xu, J.; Mcpartlon, M.; Li, J. Improved protein structure prediction by deep learning irrespective of co-evolution information. Nat. Mach. Intell. 2021, 3, 601–609.
  • Yadav, S.; Ekbal, A.; Saha, S.; Kumar, A.; Bhattacharyya, P. Feature assisted stacked attentive shortest dependency path based Bi-LSTM model for protein–protein interaction. Knowl.-Based Syst. 2019, 166, 18–29.
  • Aybey, E.; Gümüş, Ö. SENSDeep: An ensemble deep learning method for protein–protein interaction sites prediction. Interdiscip. Sci. Comput. Life Sci. 2023, 15, 55–87.
  • Li, Z.; Du, X.; Cao, Y. DAT-RNN: Trajectory Prediction with Diverse Attention. In Proceedings of the 2020 19th IEEE International Conference on Machine Learning and Applications (ICMLA), Miami, FL, USA, 14–17 December 2020; pp. 1512–1518.
  • Lee, M.j.; Ha, Y.g. Autonomous Driving Control Using End-to-End Deep Learning. In Proceedings of the 2020 IEEE International Conference on Big Data and Smart Computing (BigComp), Busan, Republic of Korea, 19–22 February 2020; pp. 470–473.
  • Codevilla, F.; Müller, M.; López, A.; Koltun, V.; Dosovitskiy, A. End-to-End Driving via Conditional Imitation Learning. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia, 21–25 May 2018; pp. 4693–4700.
  • Altché, F.; de La Fortelle, A. An LSTM Network for Highway Trajectory Prediction. In Proceedings of the 2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC), Abu Dhabi, United Arab Emirates, 25–28 October 2017; pp. 353–359.
  • Li, P.; Zhang, Y.; Yuan, L.; Xiao, H.; Lin, B.; Xu, X. Efficient long-short temporal attention network for unsupervised video object segmentation. Pattern Recognit. 2024, 146, 110078.
  • Li, R.; Shu, X.; Li, C. Driving Behavior Prediction Based on Combined Neural Network Model. IEEE Trans. Comput. Soc. Syst. 2024, 11, 4488–4496.
  • Liu, Y.; Diao, S. An automatic driving trajectory planning approach in complex traffic scenarios based on integrated driver style inference and deep reinforcement learning. PLoS ONE 2024, 19, e0297192.
  • Altindal, M.C.; Nivlet, P.; Tabib, M.; Rasheed, A.; Kristiansen, T.G.; Khosravanian, R. Anomaly detection in multivariate time series of drilling data. Geoenergy Sci. Eng. 2024, 237, 212778.
  • Matar, M.; Xia, T.; Huguenard, K.; Huston, D.; Wshah, S. Multi-Head Attention Based Bi-LSTM for Anomaly Detection in Multivariate Time-Series of WSN. In Proceedings of the 2023 IEEE 5th International Conference on Artificial Intelligence Circuits and Systems (AICAS), Hangzhou, China, 11–13 June 2023; pp. 1–5.
  • Kumaresan, S.J.; Senthilkumar, C.; Kongkham, D.; Beenarani, B.; Nirmala, P. Investigating the Effectiveness of Recurrent Neural Networks for Network Anomaly Detection. In Proceedings of the 2024 International Conference on Intelligent and Innovative Technologies in Computing, Electrical and Electronics (IITCEE), Bangalore, India, 24–25 January 2024; pp. 1–5.
  • Li, E.; Bedi, S.; Melek, W. Anomaly detection in three-axis CNC machines using LSTM networks and transfer learning. Int. J. Adv. Manuf. Technol. 2023, 127, 5185–5198.
  • Minic, A.; Jovanovic, L.; Bacanin, N.; Stoean, C.; Zivkovic, M.; Spalevic, P.; Petrovic, A.; Dobrojevic, M.; Stoean, R. Applying recurrent neural networks for anomaly detection in electrocardiogram sensor data. Sensors 2023, 23, 9878.
  • Zhou, C.; Paffenroth, R.C. Anomaly Detection with Robust Deep Autoencoders. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Halifax, NS, Canada, 13–17 August 2017; pp. 665–674.
  • Ren, H.; Xu, B.; Wang, Y.; Yi, C.; Huang, C.; Kou, X.; Xing, T.; Yang, M.; Tong, J.; Zhang, Q. Time-Series Anomaly Detection Service at Microsoft. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 3009–3017.
  • Munir, M.; Siddiqui, S.A.; Dengel, A.; Ahmed, S. DeepAnT: A deep learning approach for unsupervised anomaly detection in time series. IEEE Access 2018, 7, 1991–2005.
  • Hewamalage, H.; Bergmeir, C.; Bandara, K. Recurrent neural networks for time series forecasting: Current status and future directions. Int. J. Forecast. 2021, 37, 388–427.
  • Ahmed, S.F.; Alam, M.S.B.; Hassan, M.; Rozbu, M.R.; Ishtiak, T.; Rafa, N.; Mofijur, M.; Shawkat Ali, A.; Gandomi, A.H. Deep learning modelling techniques: Current progress, applications, advantages, and challenges. Artif. Intell. Rev. 2023, 56, 13521–13617.
  • Li, X.; Qin, T.; Yang, J.; Liu, T.Y. LightRNN: Memory and computation-efficient recurrent neural networks. Adv. Neural Inf. Process. Syst. 2016, 29.
  • Katharopoulos, A.; Vyas, A.; Pappas, N.; Fleuret, F. Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention. In Proceedings of the International Conference on Machine Learning, Virtual, 12–18 July 2020; pp. 5156–5165.
  • Shao, W.; Li, B.; Yu, W.; Xu, J.; Wang, H. When Is It Likely to Fail? Performance Monitor for Black-Box Trajectory Prediction Model. IEEE Trans. Autom. Sci. Eng. 2024, 4, 765–772.
  • Jacobs, W.R.; Kadirkamanathan, V.; Anderson, S.R. Interpretable deep learning for nonlinear system identification using frequency response functions with ensemble uncertainty quantification. IEEE Access 2024, 12, 11052–11065.
  • Mamalakis, M.; Mamalakis, A.; Agartz, I.; Mørch-Johnsen, L.E.; Murray, G.; Suckling, J.; Lio, P. Solving the enigma: Deriving optimal explanations of deep networks. arXiv 2024, arXiv:2405.10008.
  • Shah, M.; Sureja, N. A Comprehensive Review of Bias in Deep Learning Models: Methods, Impacts, and Future Directions. Arch. Comput. Methods Eng. 2024, 1–13.
  • Goethals, S.; Calders, T.; Martens, D. Beyond Accuracy-Fairness: Stop evaluating bias mitigation methods solely on between-group metrics. arXiv 2024, arXiv:2401.13391.
  • Weerts, H.; Pfisterer, F.; Feurer, M.; Eggensperger, K.; Bergman, E.; Awad, N.; Vanschoren, J.; Pechenizkiy, M.; Bischl, B.; Hutter, F. Can fairness be automated? Guidelines and opportunities for fairness-aware AutoML. J. Artif. Intell. Res. 2024, 79, 639–677.
  • Bai, Y.; Geng, X.; Mangalam, K.; Bar, A.; Yuille, A.L.; Darrell, T.; Malik, J.; Efros, A.A. Sequential Modeling Enables Scalable Learning for Large Vision Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 17–21 June 2024; pp. 22861–22872.
  • Taye, M.M. Understanding of machine learning with deep learning: Architectures, workflow, applications and future directions. Computers 2023, 12, 91.


| Reference | Year | Description |
| --- | --- | --- |
| Zaremba et al. [ ] | 2014 | Insights into RNNs in language modeling |
| Chung et al. [ ] | 2014 | Survey of advancements in RNN training, optimization, and architectures |
| Goodfellow et al. [ ] | 2016 | Review on deep learning, including RNNs |
| Greff et al. [ ] | 2016 | Extensive comparison of LSTM variants |
| Tarwani et al. [ ] | 2017 | In-depth analysis of RNNs in NLP |
| Chen et al. [ ] | 2018 | Effectiveness of RNNs in environmental monitoring and climate modeling |
| Bai et al. [ ] | 2018 | Comparison of RNNs with other sequence modeling techniques like CNNs and attention mechanisms |
| Che et al. [ ] | 2018 | Potential of RNNs in medical applications |
| Zhang et al. [ ] | 2020 | RNN applications in robotics, including path planning, motion control, and human–robot interaction |
| Dutta et al. [ ] | 2022 | Overview of RNNs, challenges in training, and advancements in LSTM and GRU for sequence learning |
| Linardos et al. [ ] | 2022 | RNNs for early warning systems, disaster response, and recovery planning in natural disaster prediction |
| Badawy et al. [ ] | 2023 | Integration of RNNs with other ML techniques for predictive analytics and patient monitoring in healthcare |
| Ismaeel et al. [ ] | 2023 | Application of RNNs in smart city technologies, including traffic prediction, energy management, and urban planning |
| Mers et al. [ ] | 2023 | Performance comparison of various RNN models in pavement performance forecasting |
| Quradaa et al. [ ] | 2024 | State-of-the-art review of RNNs, covering core architectures with a focus on applications in code clones |
| Al-Selwi et al. [ ] | 2024 | Review of LSTM applications from 2018 to 2023 |
| RNN Type | Key Features | Gradient Stability | Typical Applications |
| --- | --- | --- | --- |
| Basic RNN | Simple structure with short-term memory | High risk of vanishing gradients | Simple sequence tasks like text generation |
| LSTM | Long-term memory with input, forget, and output gates | Stable, handles vanishing gradients well | Language translation, speech recognition |
| GRU | Simplified LSTM with fewer gates | Stable, handles vanishing gradients effectively | Tasks requiring faster training than LSTM |
| Bidirectional RNN | Processes data in both forward and backward directions for better context understanding | Medium stability, depends on depth | Speech recognition and sentiment analysis |
| Deep RNN | Multiple RNN layers stacked to learn hierarchical features | Variable; the risk of vanishing gradients increases with depth | Complex sequence modeling like video processing |
| ESN | Fixed hidden layer weights, trained only at the output | Not applicable, as training bypasses typical gradient issues | Time series prediction and system control |
| Peephole LSTM | Adds peephole connections to LSTM gates | Stable, similar to LSTM | Recognition of complex temporal patterns like musical notation |
| IndRNN | Allows training of deeper networks by maintaining independence between time steps | Reduced risk of vanishing and exploding gradients | Very long sequences, such as video processing or long text generation |
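
The differences summarized above are easiest to see in code. Below is a minimal, illustrative Keras sketch (assuming TensorFlow is available; the layer sizes and the binary-classification head are arbitrary choices for the example, not taken from any cited paper) that builds the same small sequence classifier with a basic RNN, an LSTM, a GRU, or a bidirectional LSTM:

```python
import tensorflow as tf

def build_model(cell: str, vocab_size: int = 10000) -> tf.keras.Model:
    """Small sequence classifier with an interchangeable recurrent cell."""
    # Map the table's RNN types onto their Keras layers.
    recurrent = {
        "rnn": tf.keras.layers.SimpleRNN(64),   # basic RNN: short-term memory only
        "lstm": tf.keras.layers.LSTM(64),       # input/forget/output gates
        "gru": tf.keras.layers.GRU(64),         # fewer gates, faster to train
        "bilstm": tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),  # forward + backward context
    }[cell]
    return tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, 32),       # integer tokens -> dense vectors
        recurrent,
        tf.keras.layers.Dense(1, activation="sigmoid"),  # binary output, e.g., sentiment
    ])

model = build_model("gru")
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```

Swapping the `cell` argument is enough to compare training speed and accuracy across the variants on the same data.
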
| Dataset Name | Application | Description |
| --- | --- | --- |
| Penn Treebank [ ] | Natural language processing | A corpus of English sentences annotated for part-of-speech tagging, parsing, and named entity recognition; widely used for language modeling with RNNs |
| IMDB Reviews [ ] | Sentiment analysis | A dataset of movie reviews used for binary sentiment classification; suitable for studying the effectiveness of RNNs in text sentiment classification tasks |
| MNIST Sequential [ ] | Image recognition | A version of the MNIST dataset formatted as sequences for studying sequence-to-sequence learning with RNNs |
| TIMIT Speech Corpus [ ] | Speech recognition | An annotated speech database used for automatic speech recognition systems |
| Reuters-21578 Text Categorization Collection [ ] | Text categorization | A collection of newswire articles that is a common benchmark for text categorization and NLP tasks with RNNs |
| UCI ML Repository: Time Series Data [ ] | Time series analysis | Contains various time series datasets, including stock prices and weather data, ideal for forecasting with RNNs |
| CORe50 Dataset [ ] | Object recognition | Used for continuous object recognition, ideal for RNN models dealing with video input sequences where object persistence and temporal context are important |
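
Several of these benchmarks ship with common deep learning libraries, so you can start experimenting without manual downloads. As a hedged example (assuming TensorFlow/Keras, whose built-in loaders include IMDB and MNIST; the vocabulary size and sequence length below are arbitrary choices), here is one way to pull the IMDB reviews and pad them into fixed-length sequences for an RNN:

```python
import tensorflow as tf

# Load IMDB reviews as integer-encoded word sequences, keeping the
# 10,000 most frequent words (labels: 0 = negative, 1 = positive).
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.imdb.load_data(num_words=10000)

# Reviews vary in length; pad/truncate to 200 tokens so they batch cleanly.
x_train = tf.keras.preprocessing.sequence.pad_sequences(x_train, maxlen=200)
x_test = tf.keras.preprocessing.sequence.pad_sequences(x_test, maxlen=200)

print(x_train.shape, y_train.shape)  # e.g., (25000, 200) (25000,)
```
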
| Application Domain | Reference | Year | Methods and Application |
| --- | --- | --- | --- |
| Text generation | Souri et al. [ ] | 2018 | RNNs for generating coherent and contextually relevant Arabic text |
| | Holtzman et al. [ ] | 2019 | Controlled text generation using RNNs for style and content control |
| | Hu et al. [ ] | 2020 | VAEs combined with RNNs to enhance creativity in text generation |
| | Gajendran et al. [ ] | 2020 | Character-level text generation using BiLSTM for various tasks |
| | Hussein and Savaş [ ] | 2024 | LSTM for text generation |
| | Baskaran et al. [ ] | 2024 | LSTM for text generation, achieving excellent performance |
| | Islam [ ] | 2019 | Sequence-to-sequence framework using LSTM for improved text generation quality |
| | Yin et al. [ ] | 2018 | Attention mechanisms with RNNs for improved text generation quality |
| | Guo [ ] | 2015 | Integration of reinforcement learning with RNNs for text generation |
| | Keskar et al. [ ] | 2019 | Conditional Transformer Language (CTRL) model for generating text in various styles |
| Sentiment analysis | He and McAuley [ ] | 2016 | Adversarial training framework for robustness in sentiment analysis |
| | Pujari et al. [ ] | 2024 | Hybrid CNN-RNN model for sentiment classification |
| | Wankhade et al. [ ] | 2024 | Fusion of CNN and BiLSTM with attention mechanism for sentiment classification |
| | Sangeetha and Kumaran [ ] | 2023 | BiLSTM for sentiment analysis by processing text in both directions |
| | Yadav et al. [ ] | 2023 | LSTM-based models for sentiment analysis in customer reviews and social media posts |
| | Zulqarnain et al. [ ] | 2024 | Attention mechanisms and GRU for enhanced sentiment analysis |
| | Samir et al. [ ] | 2021 | Use of pre-trained models like BERT for sentiment analysis |
| | Prottasha et al. [ ] | 2022 | Transfer learning with BERT and GPT for sentiment analysis |
| | Abimbola et al. [ ] | 2024 | Hybrid LSTM-CNN model for document-level sentiment classification |
| | Mujahid et al. [ ] | 2023 | Analyzing sentiment with pre-trained models fine-tuned for specific tasks |
| Machine translation | Sennrich et al. [ ] | 2015 | Byte-Pair Encoding for handling rare words in translation models |
| | Wu et al. [ ] | 2016 | Google Neural Machine Translation with deep RNNs for improved accuracy |
| | Vaswani et al. [ ] | 2017 | Fully attention-based transformer models for superior translation performance |
| | Yang et al. [ ] | 2017 | Hybrid model integrating RNNs into the transformer architecture |
| | Song et al. [ ] | 2019 | Incorporating BERT into translation models for enhanced understanding and fluency |
| | Kang et al. [ ] | 2023 | Bilingual attention-based machine translation model combining RNN with attention |
| | Zulqarnain et al. [ ] | 2024 | Multi-stage feature attention mechanism model using GRU |
| Application Domain | Reference | Year | Methods and Application |
| --- | --- | --- | --- |
| Speech recognition | Hinton et al. [ ] | 2012 | Deep neural networks, including RNNs, for speech-to-text systems |
| | Hannun et al. [ ] | 2014 | DeepSpeech: LSTM-based speech recognition system |
| | Amodei et al. [ ] | 2016 | DeepSpeech2: Enhanced LSTM-based speech recognition with bidirectional RNNs |
| | Zhang et al. [ ] | 2017 | Convolutional RNN for robust speech recognition |
| | Chiu et al. [ ] | 2018 | RNN-transducer models for end-to-end speech recognition |
| | Dong et al. [ ] | 2018 | Speech-Transformer: Leveraging self-attention for better processing of audio sequences |
| | Bhaskar and Thasleema [ ] | 2023 | LSTM for visual speech recognition using facial expressions |
| | Daouad et al. [ ] | 2023 | Various RNN variants for automatic speech recognition |
| | Nasr et al. [ ] | 2023 | End-to-end speech recognition using RNNs |
| | Kumar et al. [ ] | 2023 | Performance evaluation of RNNs in speech recognition tasks |
| | Dhanjal et al. [ ] | 2024 | Comprehensive study of different RNN models for speech recognition |
| Time series forecasting | Nelson et al. [ ] | 2017 | Hybrid CNN-RNN model for stock price prediction |
| | Bao et al. [ ] | 2017 | Combining LSTM with stacked autoencoders for financial time series forecasting |
| | Fischer and Krauss [ ] | 2018 | Deep RNNs for predicting stock returns, outperforming traditional ML models |
| | Feng et al. [ ] | 2019 | Transfer learning with RNNs for stock prediction |
| | Rundo [ ] | 2019 | Combining reinforcement learning with LSTM for trading strategy development |
| | Devi et al. [ ] | 2024 | RNN-based model for weather prediction and capturing sequential dependencies in meteorological data |
| | Anshuka et al. [ ] | 2022 | LSTM networks for predicting extreme weather events by learning complex temporal patterns |
| | Lin et al. [ ] | 2022 | Integrating attention mechanisms with LSTM for enhanced weather forecasting accuracy |
| | Marulanda et al. [ ] | 2023 | LSTM model for short-term wind power forecasting and improving prediction accuracy |
| | Chen et al. [ ] | 2024 | Bidirectional GRU with TCNs for energy time series forecasting |
| | Hasanat et al. [ ] | 2024 | RNNs for forecasting energy demand in smart grids and optimizing renewable energy integration |
| | Asiri et al. [ ] | 2024 | Short-term renewable energy predictions using RNN-based models |
| | Yildiz et al. [ ] | 2024 | Hybrid model of LSTM with CNN for accurate electricity demand prediction |
| | Luo et al. [ ] | 2024 | Attention-based CNN-BiLSTM model for improved financial forecasting |
| | Gao et al. [ ] | 2023 | Dynamic ensemble deep ESN for wave height forecasting |
| | Bhambu et al. [ ] | 2024 | Recurrent ensemble deep random vector functional link neural network for financial time series forecasting |
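
Many of the forecasting entries above follow the same basic recipe: slice a series into fixed-length windows and train a recurrent model to predict the next value. Here is a minimal, illustrative Keras sketch of that recipe (the synthetic sine-wave data, window size, and layer width are arbitrary choices for the example, not the setup of any cited study):

```python
import numpy as np
import tensorflow as tf

# Synthetic series and sliding windows: predict the next value from the last 20.
series = np.sin(np.linspace(0, 60, 1200)).astype("float32")
window = 20
X = np.stack([series[i:i + window] for i in range(len(series) - window)])[..., None]
y = series[window:]

model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(window, 1)),  # summarize the window
    tf.keras.layers.Dense(1),                           # next-step prediction
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=3, batch_size=32, verbose=0)

print(model.predict(X[:1], verbose=0))  # forecast for the first window
```
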
| Application Domain | Reference | Year | Methods and Application |
| --- | --- | --- | --- |
| Signal processing | Mastoi et al. [ ] | 2019 | ESNs for real-time heart rate variability monitoring |
| | Valin et al. [ ] | 2021 | ESNs for speech signal enhancement in noisy environments |
| | Gao et al. [ ] | 2021 | EWT integrated with ESNs for enhanced time series forecasting |
| Bioinformatics | Li et al. [ ] | 2019 | RNNs for gene prediction and protein-structure prediction |
| | Zhang et al. [ ] | 2020 | Bidirectional LSTM for predicting DNA-binding protein sequences |
| | Xu et al. [ ] | 2021 | RNN-based model for predicting protein secondary structures |
| | Yadav et al. [ ] | 2019 | Combining BiLSTM with CNNs for protein sequence analysis |
| | Aybey et al. [ ] | 2023 | Ensemble model for predicting protein–protein interactions |
| Autonomous vehicles | Altché and de La Fortelle [ ] | 2017 | LSTM for predicting the future trajectories of vehicles |
| | Codevilla et al. [ ] | 2018 | RNNs with imitation learning for autonomous driving |
| | Li et al. [ ] | 2020 | RNNs for path planning and object detection |
| | Lee et al. [ ] | 2020 | Integrating LSTM with CNN for end-to-end autonomous driving |
| | Li et al. [ ] | 2024 | Attention-based LSTM for video object tracking |
| | Liu and Diao [ ] | 2024 | GRU with deep reinforcement learning for decision-making |
| Anomaly detection | Zhou and Paffenroth [ ] | 2017 | RNNs in unsupervised anomaly detection with deep autoencoders |
| | Munir et al. [ ] | 2018 | Hybrid CNN-RNN model for anomaly detection in time series |
| | Ren et al. [ ] | 2019 | Attention-based RNN model for anomaly detection |
| | Li et al. [ ] | 2023 | RNNs with transfer learning for anomaly detection in manufacturing |
| | Minic et al. [ ] | 2023 | RNNs for detecting anomalies in ECG signals |
| | Matar et al. [ ] | 2023 | BiLSTM for anomaly detection in multivariate time series |
| | Kumaresan et al. [ ] | 2024 | RNNs for detecting network traffic anomalies |
| | Altindal et al. [ ] | 2024 | LSTM for anomaly detection in time series data |

Source: Mienye, I.D.; Swart, T.G.; Obaido, G. Recurrent Neural Networks: A Comprehensive Review of Architectures, Variants, and Applications. Information 2024, 15, 517. https://doi.org/10.3390/info15090517


15+ Neural Network Projects Ideas for Beginners to Practice 2024

A curated list of simple, cool, and fun neural network project ideas for beginners and professionals who want to learn deep learning, master the concepts of neural networks, or make a career transition into machine learning in 2024.


Table of Contents

  • Top 15+ Neural Network Projects Ideas for 2024
  • Neural Network Projects for Beginners to Practice in 2024
  • Beginner Neural Network Projects
  • Easy Neural Network Projects for Intermediate Professionals
  • Advanced Well-Known Neural Network Projects
  • Cool Neural Network Projects in Biotechnology
  • Artificial Neural Network Projects on GitHub
  • Graph Neural Network Projects on GitHub
  • Recurrent Neural Network Projects on GitHub
  • CNN Projects on GitHub
  • Practice More with ProjectPro to Enhance Your Understanding of Neural Networks

Before we delve into these simple neural network projects, it's important to understand what exactly neural networks are.

Neural networks are changing how humans interact with systems, enabling new and more advanced mechanisms for problem-solving, data-driven prediction, and decision-making. Warren McCulloch and Walter Pitts proposed the first neural network model back in 1943. The field of AI has developed enormously since then, and neural networks now power numerous industrial and business applications.

What is a Simple Neural Network?

A neural network is a series of algorithms that uncovers relationships in data through a process loosely modeled on the workings of the human brain. These networks adapt to changes in the input, so the best possible results can be obtained without redesigning the output criteria. Neural networks have their roots in artificial intelligence and machine learning (ML).

Structure of a Neural Network

[Figure: Structure of a Neural Network. Image credit: medium.com]

The neural network technique mimics the neural operations of the human brain. A neuron here is a mathematical function that gathers information and classifies it according to a specific architecture. The network consists of several interconnected nodes arranged in layers. Each node in a neural network is a perceptron, which behaves much like a multiple linear regression whose result is passed through an activation function.
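
To make that concrete, here is a minimal NumPy sketch of a single perceptron (the specific weights, bias, and the sigmoid activation are arbitrary choices for illustration): it computes a weighted sum of its inputs, exactly like multiple linear regression, and then squashes the result through an activation function.

```python
import numpy as np

def perceptron(x: np.ndarray, w: np.ndarray, b: float) -> float:
    """One node of a neural network: weighted sum + nonlinear activation."""
    z = np.dot(w, x) + b             # linear part, identical to multiple linear regression
    return 1.0 / (1.0 + np.exp(-z))  # sigmoid activation adds the non-linearity

x = np.array([0.5, 0.1, 0.9])   # example inputs (features)
w = np.array([0.4, -0.2, 0.7])  # example learned weights
print(perceptron(x, w, b=0.1))  # output between 0 and 1
```
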

Neural networks can model non-linear processes, which is one of the primary reasons for the technology's immense popularity. This makes them extremely useful for problems such as regression, pattern recognition, clustering, anomaly detection, and more.


Applications of Neural Networks

Neural networks have a plethora of applications across domains, especially data-intensive ones. Financial forecasting, targeted marketing, credit scoring, fraud detection, machine diagnostics, and healthcare diagnostics are a few examples from the finance, marketing, manufacturing, and healthcare domains. Because neural networks can compress and analyze massive volumes of information quickly, they are relevant to most industrial and business sectors.

Why is building neural network projects the best way to learn deep learning?

Neural networks have several advantages that make projects built on them valuable in the domains where they are used. A deep neural network can model non-linear and complex relationships, which gives it vast applicability, since the majority of real-world problems fit that description. These networks can also generalize, making predictions on unseen data. Another advantage is that neural networks impose few restrictions or constraints on the input variables. Together, these properties are what make neural network projects so worthwhile.


Technologies such as neural networks cannot be mastered in a single day. It is essential to acquire theoretical knowledge of the concept, with an understanding of the key components, architecture, types, and applications. However, with neural networks and other AI/ML technologies, you can acquire and improve the skills only through practical implementation. The simple and cool neural network and deep learning project ideas we have covered provide practical experience of working on neural networks, and resolving real-world problems while working on these projects will help you develop and enhance your skill set.


Let us now discuss neural network projects that showcase the applications of this deep learning model across various industries.

This section contains simple neural network projects for newbies in the domain of machine learning and deep learning.

1) Neural Network Projects in Cryptographic Applications

You can explore a wide range of applications in cryptography using deep learning models such as neural networks. For instance, consider a sequential machine built with the Jordan network, in which the activation values of the output units are fed back into the input layer. Additional input units, referred to as state units, give the process its shape. You can design such an application for encryption and decryption using state diagrams and state tables. The state table provides the training set: the inputs comprise all possible inputs and states, and the outputs include the encrypted/decrypted output along with the next state.


You can also use neural networks to develop cryptographic applications with a chaotic network. Chaotic neural networks have weights that depend on a chaotic sequence, which in turn depends on the initial conditions and the associated parameters x and μ. The application will be exceptional from a security standpoint, as decrypting the datasets is extremely difficult without knowledge of x and μ.
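As a minimal sketch of the idea, the logistic map is a common choice for generating such a chaotic sequence. The toy stream cipher below treats x and μ as the secret key; it is an illustration of the mechanism, not production-grade cryptography:

```python
def logistic_keystream(x0: float, mu: float, n: int) -> bytes:
    """Generate n keystream bytes from the logistic map x_{k+1} = mu * x_k * (1 - x_k)."""
    x, out = x0, bytearray()
    for _ in range(n):
        x = mu * x * (1 - x)              # chaotic behavior for mu near 4
        out.append(int(x * 256) % 256)
    return bytes(out)

def xor_crypt(data: bytes, x0: float, mu: float) -> bytes:
    """Encrypt or decrypt by XOR-ing with the chaotic keystream; (x0, mu) is the key."""
    return bytes(a ^ b for a, b in zip(data, logistic_keystream(x0, mu, len(data))))

cipher = xor_crypt(b"secret message", x0=0.7, mu=3.99)
print(xor_crypt(cipher, x0=0.7, mu=3.99))  # recovers the plaintext with the right key
```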


2) Neural Network Project for Credit Scoring System

The banking and finance sector has transformed significantly over the past few years with the incorporation of advanced technology and digital systems. You can use neural networks to develop an intelligent credit scoring system for banks. Loan defaulters and fraudulent entities must be managed to avoid financial losses; banking institutions dedicate massive resources to assessing credit risk, yet such issues persist. Neural networks can be exceptionally useful in developing smart alternatives to traditional credit scoring systems.


These applications can offer improved predictive ability, with higher accuracy and better classification. You can use techniques such as logistic regression and discriminant analysis to design and develop a neural network-based credit scoring system. We have simplified the steps you may follow to develop this system:

  • First, extract real-world credit information for detailed analytics.
  • Identify the structure of the neural networks to be included; this may use radial-basis functions or a mixture-of-experts approach.
  • Continuously reduce the total error by adjusting the weights.
  • Document the optimization technique or theory you use.
  • To further improve your outcomes, compare your decision support system with the credit scoring systems already implemented by banks and financial institutions.
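As a starting point for the steps above, a small Keras classifier captures the core of such a system. Everything here is a stand-in: the random features and labels replace real credit data, and the layer sizes are arbitrary illustrative choices:

```python
import numpy as np
from tensorflow import keras

# Stand-in applicant data: 8 hypothetical features (income, debt ratio, etc.)
X = np.random.rand(1000, 8)
y = (X[:, 0] - X[:, 1] + 0.1 * np.random.randn(1000) > 0).astype(int)  # 1 = good risk

model = keras.Sequential([
    keras.layers.Input(shape=(8,)),
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),   # estimated probability of repayment
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=10, validation_split=0.2, verbose=0)
```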


3) Neural Network Project on Automatic Music Generation System

As we stated earlier, the scope for neural networks is massive: they can be used to develop basic to advanced applications in any domain. One such project is an automatic music generation system. You can make real music without any background in playing instruments, and you may develop these systems for leisure or for professional music generation. You can use MIDI file data to develop these applications and build an LSTM model to come up with new and interesting compositions.


With deep neural networks, you can program varied methods to learn and discover a wide range of patterns. These may include different music styles and harmonies. Based on the patterns, neural networks can predict the next tokens to develop rhythmic compositions.

You can combine different instruments and music forms to discover interesting music pieces.
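A minimal next-note LSTM along these lines might look as follows. The random note array is a placeholder for sequences parsed from real MIDI files, and the vocabulary size and layer widths are assumptions:

```python
import numpy as np
from tensorflow import keras

VOCAB, SEQ_LEN = 128, 32   # MIDI pitch range and context window (assumed)

# Placeholder data: a random note stream; a real project would parse MIDI files
notes = np.random.randint(0, VOCAB, size=2000)
X = np.stack([notes[i:i + SEQ_LEN] for i in range(len(notes) - SEQ_LEN)])
y = notes[SEQ_LEN:]

model = keras.Sequential([
    keras.layers.Input(shape=(SEQ_LEN,)),
    keras.layers.Embedding(VOCAB, 64),
    keras.layers.LSTM(128),
    keras.layers.Dense(VOCAB, activation="softmax"),  # distribution over the next note
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.fit(X, y, epochs=2, verbose=0)
# To compose: repeatedly predict, sample a note, append it, and slide the window
```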

Neural Network Projects with Python for Intermediate Professionals

This section has good neural network projects for professionals who have a few years of experience with these deep learning techniques.

4) Neural Network Project for Vision and Control in Autonomous Flying Vehicle

The autonomous industry has seen a massive transformation in recent years. However, one of the major ongoing debates concerns the rational decision-making abilities of semi-autonomous and fully autonomous vehicles. We propose a neural network-based vision and control system for an autonomous flying vehicle. Insects are very good at sensing image motion, or optic flow, and this biologically inspired approach can improve the operation and safety of autonomous driving applications. You can develop such a biological vision and control system for autonomous flying vehicles using deep learning architectures like neural networks.

Sensor technology is a key technology here too, and it should be combined with neural networks to measure optic flow. You can use extended Kalman filtering to accurately determine altitude and velocity from the sensor and optic flow measurements. Including neural networks alongside these sensors keeps the application from becoming computationally intensive: neural networks can carry out these functions with minimal computing hardware, making it possible to include them in the design of small unmanned vehicles such as drones.

5) Neural Network Project for Global Positioning Recommended Minimum (GPRMC) Integrity Check

Neural networks can be used to determine the integrity of GPRMC sentences, the common sentences transmitted by Global Positioning System (GPS) devices. GPS has become an extremely widely used service, so this project will allow you to target a wider audience. In this interesting neural network project, you can assess data integrity through classification accuracy and determine the minimum error using neural networks.

You will be required to preprocess the data before the training data and patterns are presented to the network. C# is a suitable programming language for preprocessing the datasets. A feedforward network trained with backpropagation (BP) is the applicable algorithm, with momentum and learning rate as the two key parameters.
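Whichever language you preprocess in, a natural first feature is whether the NMEA checksum holds, since the checksum is simply the XOR of the payload characters. Here is a small Python sketch; the example sentence is a widely circulated sample, and using the checksum as a feature is purely illustrative:

```python
def gprmc_checksum_ok(sentence: str) -> bool:
    """Check an NMEA sentence: the two hex digits after '*' are the XOR of the payload."""
    body, _, given = sentence.strip().lstrip("$").partition("*")
    calc = 0
    for ch in body:
        calc ^= ord(ch)
    return f"{calc:02X}" == given.upper()

sample = "$GPRMC,123519,A,4807.038,N,01131.000,E,022.4,084.4,230394,003.1,W*6A"
print(gprmc_checksum_ok(sample))   # True for this well-formed sample
```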

6) Neural Network Project on Handwriting Recognition Tool using Autoencoders

Before we describe the project idea, you need to understand the concept of autoencoders. These are among the simplest deep learning architectures. In these networks, the input is first compressed into a lower-dimensional code; the next step reconstructs the output from this compact code representation. Autoencoders are a kind of feed-forward neural network with three individual components built within them: the encoder, the code, and the decoder.


In these neural networks, the input passes through the encoder, which produces the code. The decoder then uses this code to produce the output, which is trained to be as close to the input as possible. We have included these details to give insight into the functioning of the autoencoder: as you may have observed, an autoencoder is a dimensionality compression algorithm. To develop the project, you will require an encoding method, a loss function, and a decoding method. We suggest either binary cross-entropy or mean squared error as the loss function. The backpropagation method can be used to train autoencoders.

The entire process applies to a variety of applications. You can develop a handwriting recognition tool using this concept: autoencoders can recognize patterns and extract features using greedy layer-wise training.
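A minimal Keras autoencoder makes the encoder-code-decoder pipeline visible end to end. MNIST is used because it ties into the handwriting theme, and the 32-unit code size is an arbitrary illustrative choice:

```python
from tensorflow import keras

(x_train, _), _ = keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0

inp = keras.layers.Input(shape=(784,))
code = keras.layers.Dense(32, activation="relu")(inp)      # the compressed code
out = keras.layers.Dense(784, activation="sigmoid")(code)  # the reconstruction

autoencoder = keras.Model(inp, out)
autoencoder.compile(optimizer="adam", loss="binary_crossentropy")
autoencoder.fit(x_train, x_train, epochs=5, batch_size=256, verbose=0)  # target = input
```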


7) Neural Network Project on Stock Market Value Prediction System using RNN

Recurrent neural networks (RNNs) can manage sequences of varying lengths, a capability feedforward networks do not offer. We propose a stock market value prediction system built on RNNs. You can train these neural networks to predict a sequence by processing real data sequences one step at a time.


You can predict the stock value of any organization using the share value history, company data, and market details. The sequences can be predicted to determine the target price of the stocks along with short-term and long-term statistics. Stop-loss prediction is also possible with the application of RNNs.
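The usual recipe is to slice the price history into fixed-length windows and train an LSTM to predict the next value. In this sketch, a synthetic random walk stands in for real market data, the window size is an assumption, and a real system would also scale the inputs:

```python
import numpy as np
from tensorflow import keras

prices = np.cumsum(np.random.randn(1000)) + 100   # stand-in for a real price history
WINDOW = 30
X = np.stack([prices[i:i + WINDOW] for i in range(len(prices) - WINDOW)])[..., None]
y = prices[WINDOW:]                                # next value after each window

model = keras.Sequential([
    keras.layers.Input(shape=(WINDOW, 1)),
    keras.layers.LSTM(64),
    keras.layers.Dense(1),                         # regression: the next price
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=3, verbose=0)
```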

8) Neural Network Project for Web-based Training System

The outbreak of the Covid-19 pandemic increased the demand for web-based applications and systems, and one such spike in demand is seen in education, learning, and training. Web-based training and learning platforms have proved very effective at both academic and professional levels. You can use neural networks to develop intelligent web-based training and education systems, as traditional learning and training solutions are gradually being replaced by automated counterparts. One essential attribute of such systems is real-time interactivity. With neural networks, you can develop an interactive learning environment and integrate it with sophisticated tools for better learning. For example, you can use user modeling to personalize content for end users, integrate your system with intelligent agents, or build an intelligent back end combined with case-based reasoning.

Advanced Well Known Neural Network Project Ideas

This section contains cool neural network projects with Python and is meant for individuals who have mastered their implementation.

9) Neural Network Project to Build a Vehicle Security System

Security systems are a must for vehicles and it is essential to have smart systems implemented to eliminate any chances of security lapse. You can design a neural network-based vehicle security system combining the techniques of facial recognition and optics.


These systems can preserve vehicle security and can also predict security risks to vehicles. For example, you can use the vehicle's location to predict the probability of theft based on the cases reported from that area. Facial recognition will ensure uncompromised authentication of vehicle owners and drivers. You can combine multi-level wavelet decomposition with neural networks to develop such systems.

10) Neural Network Project on Gender Recognition Systems

Virtual assistants are now used all across the globe. These assistants rely on speech recognition technology to recognize the human voice correctly for smooth interaction. Neural networks can be used to design gender recognition systems, which can then be integrated with smart virtual assistants. You can use neural networks to extract features from the human voice; virtual assistants can then use this real-time feature extraction and analysis to adjust how they address the user.
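A common front end for such a system is MFCC feature extraction from the audio. This sketch assumes a hypothetical recording named voice_sample.wav and uses the librosa library; the resulting vector would feed a small classifier like the dense networks sketched earlier:

```python
import numpy as np
import librosa

# Load a (hypothetical) voice recording and summarize it as MFCC statistics,
# a common input representation for voice-based classifiers.
y, sr = librosa.load("voice_sample.wav", sr=16000)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # shape: (13, n_frames)
feature_vector = np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])
print(feature_vector.shape)                          # (26,) -> input to a classifier
```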


11) Neural Network Project Idea to Implement a Text Summarizer

You can combine neural networks with natural language processing to develop this application. Automatic text summarization is the process of condensing a text into a shorter version. Manually writing and developing summaries is time-consuming; with automatic text summarizers, you can summarize the massive bodies of text used in academic research and analysis.


You can further specialize this application by including the two categories of text summarization. Extractive summarization picks sentences directly from the documents fed in as input. Abstractive summarization, on the other hand, provides a bottom-up summary through its own sentences and phrasing, which may not appear in the original document(s).
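Extractive summarization is easy to prototype without any neural network at all, which makes it a useful baseline before moving to neural abstractive models. Here is a tiny TF-IDF sentence-scoring sketch; the period-based sentence splitting is a deliberate simplification:

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def extractive_summary(text: str, n_sentences: int = 2) -> str:
    """Score each sentence by its mean TF-IDF weight and keep the top ones."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    tfidf = TfidfVectorizer().fit_transform(sentences)   # (n_sentences, n_terms)
    scores = np.asarray(tfidf.mean(axis=1)).ravel()
    top = sorted(np.argsort(scores)[-n_sentences:])      # keep original order
    return ". ".join(sentences[i] for i in top) + "."

doc = ("Neural networks learn patterns from data. They power translation and "
       "summarization. Training them requires large datasets. Summaries save time.")
print(extractive_summary(doc))
```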

12) Neural Network Project to Build Intelligent Chatbots

Web presence is now a must for businesses to reach a wider target audience. Customer service is an important part of business operations and web-based systems need to be able to carry out streamlined customer service. Chatbots developed with the help of natural language processing techniques are now deployed on the websites and web portals of the organizations to provide instant replies to the customers.

Neural Network Project to Build Intelligent Chatbots

You can use neural networks to develop intelligent and interactive chatbots. These bots can identify customers' specific queries related to the nature of the business and provide relevant answers. To develop these chatbots, you can use generative models based on neural networks; these will not need any predefined responses.

You can deploy these smart chatbots for retail applications, learning portals, banking institutions, or any other business with a web presence. The chatbot will analyze the query placed by the customer, extract the keywords, and provide a response immediately, without any waiting time. This significantly improves the customer experience.

13) Neural Network Project on Human Activity Recognition for Senior Citizens

We have proposed this simple project idea based on neural network technology for assistive care. The application can be used for human pose estimation, covering poses such as sitting on the sofa, opening the door, closing the door, falling, and so on. Many senior citizens now live alone and face major health risks. For example, people with dementia often forget their medications or the way back to their homes, and senior citizens alone at home may trip, fall, and hurt themselves significantly. You can develop these computer vision-based human activity recognition systems by combining a series of images and classifying the actions.

14) Neural Network Project on Sentiment Analysis

In this project, the goal is to create a system that analyzes social media posts and categorizes them based on their sentiment. The system would use a neural network trained on a large dataset of labeled social media posts to identify patterns and relationships between words and sentiments. The network could then predict the sentiment of new, unlabeled posts with a high degree of accuracy. This system could be useful for businesses looking to monitor their brand reputation or for social media platforms looking to filter out potentially harmful content. It could also be applied to other forms of text data, such as customer reviews or news articles.
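A compact baseline for this kind of text classifier is sketched below. It uses the public IMDB movie review dataset as a stand-in for labeled social media posts; the vocabulary size and architecture are illustrative choices:

```python
from tensorflow import keras

VOCAB, MAXLEN = 10000, 200
(x_train, y_train), _ = keras.datasets.imdb.load_data(num_words=VOCAB)
x_train = keras.preprocessing.sequence.pad_sequences(x_train, maxlen=MAXLEN)

model = keras.Sequential([
    keras.layers.Input(shape=(MAXLEN,)),
    keras.layers.Embedding(VOCAB, 32),            # learn word embeddings from scratch
    keras.layers.GlobalAveragePooling1D(),        # average word vectors per review
    keras.layers.Dense(16, activation="relu"),
    keras.layers.Dense(1, activation="sigmoid"),  # positive vs. negative sentiment
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=2, batch_size=128, verbose=0)
```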

15) Neural Network Project on Self-Driving Cars

Utilizing neural networks for self-driving cars involves creating a system that uses image processing to detect and classify the different objects and obstacles in the environment so the car can make safe and accurate driving decisions. The system would use a combination of convolutional neural networks (CNNs) and recurrent neural networks (RNNs) to process real-time data from sensors such as cameras, lidars, and radars. The CNNs would be responsible for detecting and classifying objects in the environment, while the RNNs would use this information to predict the trajectory and behavior of those objects. This system could be implemented in autonomous vehicles to improve their navigation and decision-making capabilities, ultimately leading to safer roads and more efficient transportation systems. If you are looking for convolutional neural network projects, do not miss this one.


This section contains a list of fun neural network projects that utilize these amazing deep learning architectures in the biotechnology field.

16) Neural Network Project to Build a Cancer Detection System

Healthcare is one sector that you can explore for the implementation of the latest technology. Smart healthcare is now being launched and promoted by numerous countries across the globe.

Neural Network Project to Build a Cancer Detection System

You can use neural networks to develop a cancer detection system; these networks can improve the efficiency of medical diagnosis. There is a lot of variation in cancer cells, and the diagnosis varies from one patient to another. With a multi-tiered neural network, you will be able to differentiate between malignant and non-malignant tissues. The system will also enable you to incorporate the patient's other medical information for improved diagnosis and treatment.

17) Neural Network Project for Stress Diagnosis - Skin Conductance Sensor Signals

Stress has become extremely common, with a reported 33% of people suffering from extreme stress, and technology can be useful in measuring and managing it. You can combine sensor technology and artificial neural networks to develop an intelligent stress diagnosis system. Diagnosing stressful events and triggers is not easy when relying solely on sensor signals such as skin conductance (SC) and finger temperature (FT); neural networks combined with these sensors will help you identify the hidden patterns in the data.


Applying neural networks with case-based reasoning leads to precise feature extraction from the datasets. The angle of change can be acquired from the FT-slope and SC-slope signals and combined with the SC response over a specified period. The extracted features can then be matched against old cases in the data library, and the neural network automatically assigns weights to the features to obtain an outcome.

Artificial Neural Network projects with Source Code on Github

This section has artificial neural network projects with source code on GitHub.

Handwritten Digit Recognition

The MNIST dataset is a popular dataset among deep learning enthusiasts. In this project, you can use the artificial neural network algorithm to identify which number is represented by the input image. You can also use KNN, Multi-layer perceptron, and SVM algorithms for this task and then compare the results of all four algorithms.

GitHub Repository: https://github.com/OSSpk/Handwritten-Digits-Classification-Using-KNN-Multiclass_Perceptron-SVM   

Customer Churn Prediction

To build a neural network project for customer churn prediction, the first step is to gather and preprocess customer data, such as demographics, purchase history, and customer service interactions. Then, design and train a neural network model to predict which customers are most likely to churn. Finally, evaluate the performance of the model and refine it as needed to achieve the highest accuracy possible.

GitHub Repository: https://github.com/TatevKaren/artificial-neural-network-business_case_study  

Predicting Purchasing Intention

This neural network project aims to predict whether a customer will make a purchase based on their browsing behavior and other characteristics. The project uses a neural network algorithm and features such as time spent on page, number of pages viewed, and device type to make predictions. The dataset used in the project is publicly available and the code is written in Python using TensorFlow and Keras libraries. 

GitHub Repository: https://github.com/kb22/Purchasing-Intention-Prediction  


If you are looking for fun neural network project ideas for beginners that utilize graph neural networks, then check out the projects listed below.

Fraud Detection

Losing money to fraudulent transactions is a problem for many businesses, but machine learning and deep learning based fraud detection systems can improve the situation for business owners. In this project, you will utilize graph-based algorithms for fraud detection in financial transactions. The project involves preprocessing transactional data and constructing a graph network of the transactions, where the nodes represent the accounts and the edges represent the transactions. A graph neural network algorithm is then used to identify suspicious transactions and accounts.

GitHub Repository: https://github.com/waittim/graph-fraud-detection  

Node Classification with GraphSAGE

This neural network project is an implementation of a GraphSAGE (graph neural network) model for node classification on the CORA dataset. The project involves loading and preprocessing the data, designing a GraphSAGE model architecture, and training the model to classify nodes into different categories.

GitHub Repository: https://github.com/williamleif/graphsage-simple  

Traffic Prediction on the METR-LA Dataset

The METR-LA dataset is a large-scale urban traffic speed prediction dataset containing real-time traffic speed data collected from loop detectors on the highways of Los Angeles County. The goal of this project is to use deep learning techniques for traffic forecasting in urban cities by predicting the speed of vehicles at different locations and times. The project involves preprocessing and formatting time-series traffic data, designing a Diffusion Convolutional Recurrent Neural Network (DCRNN) architecture, and training the model to forecast future traffic flow.

GitHub Repository: https://github.com/liyaguang/DCRNN  

Recurrent Neural Network (RNN) Projects on GitHub 

This section presents a list of cool RNN projects.

Stock Price Prediction

This project uses a stacked LSTM neural network to predict the future stock prices of Google based on historical data. The model is trained on a dataset that includes the opening, closing, high, low, and volume of Google stock prices over a period of time. The code is written in Python and provides a practical example of how neural networks can be used for stock price prediction.

GitHub Repository: https://github.com/sonu275981/Google_Stock_Price_Prediction-And-Forecasting-Using-Stacked-LSTM  

Polyphonic Piano Transcription

If you think AI can’t assist in the music-making process, hold your horses, because this project is likely to blow your mind. It aims to generate piano MIDI files from audio pieces in different formats using neural networks. It uses a recurrent neural network (RNN) to process audio signals and predict the corresponding musical notes played on the piano. The project is a useful resource for anyone interested in audio signal processing and music transcription using neural networks.

GitHub Repository: https://github.com/BShakhovsky/PolyphonicPianoTranscription

Language Classifier

In this neural network project, you will build a language classification system that can classify text into the following five languages: English, Spanish, Finnish, Dutch, and Polish. The project is trained on a dataset that includes text samples from multiple languages, and the RNN model is used to predict the language of new text data. This project is a must for individuals interested in natural language processing and language classification using neural networks.

GitHub Repository: https://github.com/JasonFengGit/RNN-Language-Classifier  


If you are looking for convolutional neural network projects for beginners, then check out the CNN project ideas mentioned in this section.

Anomaly Detection

This project aims to detect anomalous events in videos obtained from a camera mounted on a robot using deep learning techniques. The project uses a pre-trained convolutional neural network (CNN) to extract features from video frames. The model is trained on a dataset of normal and abnormal events, and is used to predict whether a new event in the video contains strange objects (which may hamper the path of the robot) or not. 

GitHub Repository: https://github.com/cbalkig/Anomaly_Detection_in_Videos  

English Character Recognition

This project uses a convolutional neural network to detect handwritten letters of the English language. It contains the code for building a character recognition system with the computer vision library OpenCV, the machine learning library scikit-learn, and the deep learning libraries Keras and TensorFlow.

GitHub Repository: https://github.com/MichaelSDavid/CharCNN  

Mask Detection

This project involves detecting whether a person is wearing a mask. It covers preprocessing images of people and training a CNN model to classify them as masked or unmasked, using a dataset of images of people with and without masks. The project will also guide you on how to handle an imbalanced dataset.

GitHub Repository: https://github.com/yuhung1206/Detection-of-Mask-Wearing-using-CNN  

Neural networks and deep learning belong to the family of artificial intelligence and machine learning technologies. Several industries and businesses already use these advanced technologies and concepts to gain an edge in the market: sectors such as banking, retail, healthcare, marketing, and manufacturing have all been deploying AI-based systems and applications.

Studying neural networks and deep learning takes a lot of effort, and the best way to master these skills is to practice, practice, and practice. With the above deep learning project ideas, you can explore the world of neural networks and come up with innovative systems. You can combine neural networks with other emerging technologies and concepts to deliver an improved user experience. The project ideas we have proposed are widely applicable and will engage a broad audience.


1. What is a neural network project?

A neural network project involves the design, implementation, and training of artificial neural networks to solve specific problems or tasks. Neural networks are a type of machine learning model that can learn from large datasets to make predictions or classifications, and can be applied to various fields such as computer vision, natural language processing, and robotics. Neural network projects often involve complex algorithms, data preprocessing, and hyperparameter tuning to achieve optimal performance.

2. What are 3 examples of neural network?

Three examples of neural networks include convolutional neural networks (CNNs) used for image recognition and classification, recurrent neural networks (RNNs) used for natural language processing and sequence prediction, and generative adversarial networks (GANs) used for generating realistic images and data.

3. What are 5 examples of use of neural networks in our everyday life?

Five examples of the use of neural networks in our everyday life include:

  • Voice assistants like Siri and Alexa
  • Image recognition in social media and security systems
  • Personalized recommendations in online shopping and streaming services
  • Language translation in chatbots and language learning apps
  • Fraud detection in financial transactions



10 Types of Neural Networks, Explained


Neural networks have become a driving force in the world of machine learning, enabling us to make significant strides in fields like speech recognition, image processing, and even medical diagnosis. This technology has evolved rapidly over the past few years, allowing us to develop powerful systems that can mimic the way our brains process information.

The impact of neural networks is being felt across countless industries, from healthcare to finance to marketing. They’re helping us solve complex problems in new and innovative ways, and yet we’ve only scratched the surface of what neural networks can do.

But not all neural networks are built alike. In fact, neural networks can take many different shapes and forms, and each is uniquely positioned to tackle different problems and types of data. Here, we’ll explore some of the different types of neural networks, explain how they work, and provide insight into their real-world applications. 

What are Neural Networks?

Before we dive into the types of neural networks, it’s essential to understand what neural networks are. 

A sub-discipline of deep learning, neural networks are complex computational models that are designed to imitate the structure and function of the human brain. These models are composed of many interconnected nodes — called neurons — that process and transmit information. With the ability to learn patterns and relationships from large datasets, neural networks enable the creation of algorithms that can recognize images, translate languages, and even predict future outcomes.

Neural networks are often referred to as a black box because their inner workings are opaque: we don’t always know how all the individual neurons work together to arrive at the final output. You feed data into the network, anything from images to text to numerical data, and it processes that data through its interconnected neurons. The output could be anything from a prediction about the input to a classification of it, based on the data that was fed into the network.

Neural networks are especially adept at recognizing patterns, and this makes them incredibly useful for solving complex problems that involve large amounts of data. They can be used to make stock market predictions, analyze X-rays and CT scans, and even forecast the weather.

How Do Neural Networks Work?

Neural networks are designed to learn from data, which means that they improve their performance over time as they are exposed to more data. This process of learning is called training, and it involves adjusting the weights and biases of the neurons in the network to minimize the error between the predicted output and the actual output.

Weights and Biases

In neural networks, weights and biases are numerical values assigned to each neuron, which help the network make predictions or decisions based on input data.

Imagine you’re trying to predict whether someone will like a certain movie based on their age and gender. In a neural network, each neuron in the input layer represents a different piece of information about the person, such as their age and gender. These neurons then pass their information to the next layer, where each neuron has a weight assigned to it that represents how important that particular input is for making the prediction.

For example, let’s say the network determines that age is more important than gender in predicting movie preferences. In this case, the age neuron would have a higher weight than the gender neuron, indicating that the network should pay more attention to age when making predictions.

Biases in a neural network are similar to weights, but they’re added to each neuron before the activation function — which decides how much of the inputs from the previous layer of neurons should be passed on to the next layer — is applied. Think of a bias as a sort of “default” value for a neuron — it helps the network adjust its predictions based on the overall tendencies of the data it’s processing.

For example, if the network is trying to predict whether someone will like a movie based on their age and gender, and it has seen that women generally tend to like romantic comedies more than men, it might adjust its predictions by adding a positive bias to the output of the gender neuron when it’s processing data from women. This would essentially tell the network to “expect” women to be more likely to like romantic comedies, based on what it has learned from the data.

How Neural Networks Are Structured

The basic structure of a neural network consists of three layers: the input layer, the hidden layer(s), and the output layer. The input layer is where the data is fed into the network, and the output layer is where the network outputs its prediction or decision.

The hidden layer(s) are where most of the computation in the network takes place. Each neuron in the hidden layer is connected to every neuron in the previous layer, and the weights and biases of these connections are adjusted during training to improve the performance of the network.

The number of hidden layers and neurons in each layer can vary depending on the complexity of the problem and the amount of data available. Deep neural networks, which have multiple hidden layers, have been shown to be particularly effective for complex tasks such as image recognition and natural language processing.

Types of Neural Networks

Neural networks can take many different forms, each with their own unique structure and function. In this section, we will explore some of the most common types of neural networks and their applications.

Feedforward Neural Networks

Feedforward neural networks are the most basic type of neural network. They consist of an input layer, one or more hidden layers, and an output layer. The data flows through the network in a forward direction, from the input layer to the output layer.

Feedforward neural networks are widely used for a variety of tasks, including image and speech recognition, natural language processing, and predictive modeling. For example, a feedforward neural network could be used to predict the likelihood of a customer churning based on their past behavior.

In a feedforward neural network, the input data is passed through the network, and each neuron in the hidden layer(s) performs a weighted sum of the inputs, applies an activation function, and passes the output to the next layer. The weights and biases of the neurons are adjusted during training to minimize the error between the predicted output and the actual output.
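In code, that forward pass is just a matrix multiplication plus an activation at each layer. A minimal NumPy sketch with random, untrained weights:

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.random(4)                                  # 4 input features
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)      # input -> hidden (8 neurons)
W2, b2 = rng.normal(size=(1, 8)), np.zeros(1)      # hidden -> output

h = relu(W1 @ x + b1)         # each hidden neuron: weighted sum, then activation
y_hat = sigmoid(W2 @ h + b2)  # the output layer's prediction
print(y_hat)                  # training would adjust W and b to reduce the error
```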

Perceptron

The perceptron is one of the earliest types of neural networks and was first implemented in 1958 by Frank Rosenblatt. It is a single-layer neural network that takes a set of inputs, processes them, and produces an output.

Perceptrons can be used for a range of tasks, including image recognition, signal processing, and control systems. However, one drawback of these neural networks is that they can only solve problems where the data can be separated into two categories using a straight line — known as a linearly separable problem — limiting the network’s ability to solve more complex problems. 

Perceptrons work by applying weights to the input data and then summing them up. The sum is then passed through an activation function to produce an output. The activation function is typically a threshold function that outputs a 1 or 0 depending on whether the sum is above or below a certain threshold.  

Multilayer Perceptron

The Multilayer Perceptron (MLP) is a type of neural network that contains multiple layers of perceptrons. MLPs are a type of feedforward neural network and are commonly used for classification tasks.

Each layer in an MLP consists of multiple perceptrons, and the output of one layer is fed into the next layer as input. The input layer receives the raw data, and the output layer produces the final prediction. The hidden layers in between are responsible for transforming the input into a form that is suitable for the output layer.

Some applications of MLPs include image recognition, speech recognition, time series analysis, and natural language processing. 

Recurrent Neural Networks

Recurrent neural networks (RNNs) are a type of neural network that are designed for processing sequential data, such as text and speech. They are made up of recurrent neurons, which allow the network to maintain a “memory” of previous inputs.

RNNs are commonly used for natural language processing tasks, such as language translation and text generation. They can also be used for speech recognition and time series prediction. For example, an RNN could be used to generate a new sentence based on a given input sentence.

In an RNN, the input data is processed through a series of recurrent neurons, which take the current input and the output from the previous time step as input. This allows the network to maintain a memory of previous inputs and context. The weights and biases of the neurons are adjusted during training to minimize the error between the predicted output and the actual output — a process called backpropagation.

LSTM – Long Short-Term Memory

Long Short-Term Memory (LSTM) is a type of recurrent neural network (RNN) that is designed to handle long-term dependencies. It is composed of memory cells, input gates, output gates, and forget gates.

LSTM networks are used in natural language processing tasks, such as speech recognition, text translation, and sentiment analysis. They are also used in the field of image recognition, where they are used to recognize objects and scenes within an image.

LSTM networks work by allowing information to flow through the memory cells over time. The input gate determines which information should be stored in the memory cells, while the forget gate determines which information should be removed. The output gate then determines which information should be passed on to the next layer. This allows the network to remember important information over long periods of time and to selectively forget irrelevant information.
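Those gate interactions are compact enough to write out directly. Here is one LSTM time step in NumPy, with the four gate weight blocks packed into single matrices; the dimensions are arbitrary, for illustration only:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step; W, U, b stack the input, forget, output, and candidate blocks."""
    z = W @ x + U @ h_prev + b
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)   # input, forget, and output gates
    c = f * c_prev + i * np.tanh(g)                # keep some memory, write some new
    h = o * np.tanh(c)                             # expose part of the memory
    return h, c

H, D = 16, 8                                       # hidden size and input size
rng = np.random.default_rng(0)
W, U, b = rng.normal(size=(4 * H, D)), rng.normal(size=(4 * H, H)), np.zeros(4 * H)
h, c = lstm_step(rng.random(D), np.zeros(H), np.zeros(H), W, U, b)
```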

LSTM networks have proven to be very effective in solving problems with long-term dependencies and are widely used in the field of natural language processing. They are also used in speech recognition, handwriting recognition, and other applications where long-term memory is important.

Radial Basis Functional Neural Network

A Radial Basis Function (RBF) neural network is another type of feedforward neural network that uses a set of radial basis functions to transform its inputs into outputs. Like many neural networks, it is composed of three layers: the input layer, the hidden layer, and the output layer.

RBF networks are commonly used for pattern recognition, classification, and control tasks. One of the most popular applications of RBF networks is in the field of image recognition, where they are used to identify objects within an image.

The RBF network works by first transforming the input data using a set of radial basis functions. These functions calculate the distance between the input and a set of predefined centers in the hidden layer. The outputs from the hidden layer are then combined linearly to produce the final output. The weights of the connections between the hidden layer and the output layer are trained using a supervised learning algorithm, such as backpropagation.

RBF networks are often used for problems with large datasets because they can learn to generalize well and provide good predictions. They are also used for time-series analysis and prediction, as well as financial forecasting.
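The distance-then-linear-combination structure is easy to see in code. Here is a forward pass through a tiny RBF network with hypothetical centers and weights; in practice the output weights would be fit by supervised training:

```python
import numpy as np

def rbf_forward(x, centers, gamma, w):
    """RBF network: Gaussian distance features, then a linear output layer."""
    phi = np.exp(-gamma * np.sum((centers - x) ** 2, axis=1))  # one value per center
    return phi @ w

rng = np.random.default_rng(0)
centers = rng.random((5, 2))    # 5 predefined centers in a 2-D input space
w = rng.normal(size=5)          # output weights (normally learned from data)
print(rbf_forward(np.array([0.3, 0.7]), centers, gamma=2.0, w=w))
```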

Convolutional Neural Networks

Convolutional neural networks (CNNs) are a type of neural network designed for processing grid-like data, such as images. They are made up of multiple layers, including convolutional layers, pooling layers, and fully connected layers, each playing a distinct, interconnected part in processing data and simplifying outputs.

CNNs are commonly used for image and video recognition tasks, such as object detection, facial recognition, and self-driving cars. For example, a CNN could be used to classify images of cats and dogs based on their features.

In a CNN, the input data is processed through multiple convolutional layers, which apply filters to the input and extract features. The output of the convolutional layers is then passed through pooling layers, which downsample the data and reduce its dimensionality. Finally, the output is passed through fully connected layers, which perform the final classification or prediction.
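Stacking those three layer types in Keras yields the classic small image classifier. The input size and the two-class output below are placeholder choices, echoing the cats-vs-dogs example above:

```python
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Input(shape=(64, 64, 3)),           # e.g. small RGB images
    keras.layers.Conv2D(16, 3, activation="relu"),   # convolution extracts features
    keras.layers.MaxPooling2D(),                     # pooling downsamples
    keras.layers.Conv2D(32, 3, activation="relu"),
    keras.layers.MaxPooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dense(64, activation="relu"),       # fully connected layers
    keras.layers.Dense(2, activation="softmax"),     # e.g. cat vs. dog
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```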

Autoencoder Neural Networks

Autoencoder neural networks are a type of neural network that is used for unsupervised learning, which means that they do not require labeled data to make predictions. They are primarily used for data compression and feature extraction.

Autoencoder neural networks work by compressing the input data into a lower-dimensional representation and then reconstructing it back into the original format. This allows them to identify the most important features of the input data.

Autoencoder neural networks are commonly used in applications such as data compression, image denoising, and anomaly detection. For example, NASA uses an autoencoder algorithm to detect anomalies in spacecraft sensor data.

Sequence to Sequence Models

Sequence to sequence (Seq2Seq) models are a type of neural network that uses deep learning techniques to enable machines to understand and generate natural language. They consist of an encoder and a decoder, which convert one sequence of data into another. This type of network is often used in machine translation, summarization, and conversation systems.

One of the most common applications of Seq2Seq models is machine translation, where the encoder takes the source language and converts it into a vector representation, which the decoder then uses to generate the corresponding text in the target language. Seq2Seq models have been used to develop state-of-the-art machine translation systems, such as Google Translate and DeepL.

Another application of Seq2Seq models is in summarization, where the encoder takes a long document and generates a shorter summary. These models have also been used in chatbots and other conversational agents to generate responses to user input.

Seq2Seq models work by first encoding the input sequence into a fixed-length vector representation, which captures the meaning of the sequence. The decoder then uses this vector to generate the output sequence one element at a time, predicting the next element based on the previous one and the context vector.
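The encoder-decoder split maps directly onto code. This Keras sketch wires an LSTM encoder's final states into an LSTM decoder, the usual training setup with teacher forcing; the vocabulary sizes and dimensions are placeholders:

```python
from tensorflow import keras

SRC_VOCAB, TGT_VOCAB, DIM = 5000, 5000, 128   # assumed sizes

# Encoder: read the source sequence, keep only its final states (the context)
enc_in = keras.layers.Input(shape=(None,))
enc_emb = keras.layers.Embedding(SRC_VOCAB, DIM)(enc_in)
_, state_h, state_c = keras.layers.LSTM(DIM, return_state=True)(enc_emb)

# Decoder: generate the target sequence conditioned on the encoder's states
dec_in = keras.layers.Input(shape=(None,))
dec_emb = keras.layers.Embedding(TGT_VOCAB, DIM)(dec_in)
dec_seq = keras.layers.LSTM(DIM, return_sequences=True)(
    dec_emb, initial_state=[state_h, state_c])
probs = keras.layers.Dense(TGT_VOCAB, activation="softmax")(dec_seq)

model = keras.Model([enc_in, dec_in], probs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
```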

Modular Neural Network

Modular neural networks (MNN) are a type of neural network that allows multiple networks to be combined and work together to solve complex problems. In a modular network, each module is a separate network that is designed to solve a specific subproblem. The outputs from each module are then combined to provide a final output.

MNNs have been used to solve a wide range of complex problems, including computer vision, speech recognition, and robotics. For example, in computer vision, a modular network may be used to detect different objects in an image, with each module responsible for detecting a specific type of object. The outputs from each module are then combined to provide a final classification of the image.

One advantage of MNNs is that they allow for flexibility and modularity in the design of neural networks, making it easier to build complex systems by combining simpler modules. This makes it possible to develop large-scale systems with multiple modules, each solving a specific subproblem.

Another advantage of MNNs is that they can be more robust than traditional neural networks, as each module can be designed to handle a specific type of input or noise. This means that even if one module fails, the overall system can still function, as other modules can take over.

Key Takeaways

As technology continues to evolve, the use of neural networks is becoming increasingly important in the tech industry, and the demand for professionals with machine learning skills is growing rapidly. To learn more about the skills and competencies needed to excel in machine learning, check out HackerRank’s role directory and explore our library of up-to-date resources.

This was written with the help of AI. Can you tell which parts?

Get started with HackerRank

Over 2,500 companies and 40% of developers worldwide use HackerRank to hire tech talent and sharpen their skills.

Enhancing Breast Cancer Detection Through a Tailored Convolutional Neural Network Deep Learning Approach

  • Original Research
  • Published: 27 August 2024
  • Volume 5, article number 826 (2024)


  • Job Prasanth Kumar Chinta Kunta &
  • Vijayalakshmi A. Lepakshi

Around the globe, the most common form of malignancy afflicting women is breast cancer. The disparate impact of breast cancer on women makes it a compelling topic for research studies. Moreover, predicting breast cancer is imperative and essential to conceive a clear schema of therapy and personalized medication. Breast cancer also has a profound emotional and psychological impact on patients and their support networks, emphasizing the need for comprehensive care and support. As a result of the disease's costs of diagnosis, treatment, and lost productivity, healthcare systems and economies are burdened. This paper contributes a deep learning model, Custom CNN, to classify breast cancer images into benign and malignant forms by analyzing the histopathological images of the BreakHis dataset. The results obtained are further compared with the pre-trained models MobileNetV3, EfficientNetB1, VGG16, and ResNet50V2 on the same dataset. Among them, the Custom CNN achieved the highest accuracy, 92%, outperforming the other CNN models.


Data Availability

The corresponding author can provide access to the dataset generated and analyzed in the current study upon reasonable request.


Acknowledgements

The authors acknowledge REVA University, Bangalore, Karnataka, India for supporting the research work by providing the facilities.

No funding was received for this research.

Author information

Authors and Affiliations

School of Computer Science and Applications, REVA University, Rukmini Knowledge Park, Kattigenahalli, Yelahanka, Bangalore, 560064, Karnataka, India

Job Prasanth Kumar Chinta Kunta & Vijayalakshmi A. Lepakshi


Contributions

The research outcomes were significantly shaped by the collaborative efforts and collective contributions of all authors involved in this endeavor.

Corresponding author

Correspondence to Job Prasanth Kumar Chinta Kunta .

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.


About this article

Kunta, J.P.K.C., Lepakshi, V.A. Enhancing Breast Cancer Detection Through a Tailored Convolutional Neural Network Deep Learning Approach. SN COMPUT. SCI. 5, 826 (2024). https://doi.org/10.1007/s42979-024-03197-2

Download citation

Received: 02 July 2024

Accepted: 03 August 2024

Published: 27 August 2024

DOI: https://doi.org/10.1007/s42979-024-03197-2


Keywords: MobileNetV3, EfficientNetB1


AI vs. Machine Learning vs. Deep Learning vs. Neural Networks: What's the Difference?

These computer science terms are often used interchangeably, but what differences make each a unique technology?

Technology is becoming more embedded in our daily lives by the minute. To keep up with the pace of consumer expectations, companies are relying more heavily on machine learning algorithms to make things easier. You can see its application in social media (through object recognition in photos) or in talking directly to devices (such as Alexa or Siri).

While artificial intelligence (AI), machine learning (ML), deep learning and neural networks are related technologies, the terms are often used interchangeably, which frequently leads to confusion about their differences. This blog post clarifies some of the ambiguity.

The easiest way to think about AI, machine learning, deep learning and neural networks is to think of them as a series of AI systems from largest to smallest, each encompassing the next.

AI is the overarching system. Machine learning is a subset of AI. Deep learning is a subfield of machine learning, and neural networks make up the backbone of deep learning algorithms. It’s the number of node layers, or depth, of neural networks that distinguishes a single neural network from a deep learning algorithm, which must have more than three layers.

Artificial intelligence or AI, the broadest term of the three, is used to classify machines that mimic human intelligence and human cognitive functions like problem-solving and learning. AI uses predictions and automation to optimize and solve complex tasks that humans have historically done, such as facial and speech recognition, decision-making and translation.

Categories of AI

The three main categories of AI are:

  • Artificial Narrow Intelligence (ANI)
  • Artificial General Intelligence (AGI)
  • Artificial Super Intelligence (ASI)

ANI is considered “weak” AI, whereas the other two types are classified as “strong” AI. We define weak AI by its ability to complete a specific task, like winning a chess game or identifying a particular individual in a series of photos. Natural language processing and computer vision, which let companies automate tasks and underpin chatbots and virtual assistants such as Siri and Alexa, are examples of ANI. Computer vision is also a factor in the development of self-driving cars.

Stronger forms of AI, like AGI and ASI, incorporate human behaviors more prominently, such as the ability to interpret tone and emotion. Strong AI is defined by its capabilities relative to humans: AGI would perform on par with a human, while ASI (also known as superintelligence) would surpass a human's intelligence and ability. Neither form of strong AI exists yet, but research in this field is ongoing.

Using AI for business

An increasing number of businesses, about 35% globally, are using AI, and another 42% are exploring the technology. Generative AI, which uses powerful foundation models trained on large amounts of unlabeled data, can be adapted to new use cases, and the flexibility and scalability it brings are likely to accelerate the adoption of AI significantly. In early tests, IBM has seen generative AI bring time to value up to 70% faster than traditional AI.

Whether you use AI applications based on ML or foundation models, AI can give your business a competitive advantage. Integrating customized AI models into your workflows and systems, and automating functions such as customer service, supply chain management and cybersecurity, can help a business meet customers’ expectations, both today and as they increase in the future.

The key is identifying the right data sets from the start to help ensure that you use quality data to achieve the most substantial competitive advantage. You’ll also need to create a hybrid, AI-ready architecture that can successfully use data wherever it lives—on mainframes, data centers, in private and public clouds and at the edge.

Your AI must be trustworthy, because anything less risks damage to a company's reputation and regulatory fines. Misleading models, and models that contain bias or that hallucinate, can come at a high cost to customers' privacy, data rights and trust. Your AI must be explainable, fair and transparent.

Machine learning is a subset of AI that allows for optimization. When set up correctly, it helps you make predictions that minimize the errors that arise from merely guessing. For example, companies like Amazon use machine learning to recommend products to a specific customer based on what they’ve looked at and bought before.

Classic or “nondeep” machine learning depends on human intervention to allow a computer system to identify patterns, learn, perform specific tasks and provide accurate results. Human experts determine the hierarchy of features to understand the differences between data inputs, usually requiring more structured data to learn.

For example, let’s say I showed you a series of images of different types of fast food: “pizza,” “burger” and “taco.” A human expert working on those images would determine the characteristics distinguishing each picture as a specific fast food type. The bread in each food type might be a distinguishing feature. Alternatively, they might use labels, such as “pizza,” “burger” or “taco” to streamline the learning process through supervised learning.

While the subset of AI called deep machine learning can leverage labeled data sets to inform its algorithm in supervised learning, it doesn't necessarily require a labeled data set. It can ingest unstructured data in its raw form (for example, text, images), and it can automatically determine the set of features that distinguish “pizza,” “burger” and “taco” from one another. As we generate more big data, data scientists use more machine learning. For a deeper dive into the differences between these approaches, see “Supervised versus unsupervised learning: What's the difference?”

A third category of machine learning is reinforcement learning, where a computer learns by interacting with its surroundings and getting feedback (rewards or penalties) for its actions. And online learning is a type of ML where a data scientist updates the ML model as new data becomes available.


As our article on deep learning explains, deep learning is a subset of machine learning. The primary difference between machine learning and deep learning is how each algorithm learns and how much data each type of algorithm uses.

Deep learning automates much of the feature extraction piece of the process, eliminating some of the manual human intervention required. It also enables the use of large data sets, earning the title of scalable machine learning. That capability is exciting as we explore the use of unstructured data further, particularly since over 80% of an organization's data is estimated to be unstructured.

Observing patterns in the data allows a deep-learning model to cluster inputs appropriately. Taking the same example from earlier, we might group pictures of pizzas, burgers and tacos into their respective categories based on the similarities or differences identified in the images. A deep-learning model requires more data points to improve accuracy, whereas a machine-learning model relies on less data given its underlying data structure. Enterprises generally use deep learning for more complex tasks, like virtual assistants or fraud detection.

Neural networks, also called artificial neural networks or simulated neural networks, are a subset of machine learning and are the backbone of deep learning algorithms. They are called “neural” because they mimic how neurons in the brain signal one another.

Neural networks are made up of node layers—an input layer, one or more hidden layers and an output layer. Each node is an artificial neuron that connects to the next, and each has a weight and threshold value. When one node’s output is above the threshold value, that node is activated and sends its data to the network’s next layer. If it’s below the threshold, no data passes along.
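To make that mechanism concrete, here is a minimal sketch in Python of a single node's behavior; the weights and threshold are illustrative values rather than learned parameters:

```python
import numpy as np

# One artificial node: compute a weighted sum of the inputs, then pass
# the result to the next layer only if it exceeds the node's threshold.
def node_output(inputs, weights, threshold):
    weighted_sum = np.dot(inputs, weights)
    return weighted_sum if weighted_sum > threshold else 0.0  # no data passes

inputs = np.array([0.7, 0.2, 0.9])
weights = np.array([0.4, 0.1, 0.8])
print(node_output(inputs, weights, threshold=0.5))  # 1.02, so the node fires
```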

Training data teach neural networks and help improve their accuracy over time. Once the learning algorithms are fine-tuned, they become powerful computer science and AI tools because they allow us to quickly classify and cluster data. Using neural networks, speech and image recognition tasks can happen in minutes instead of the hours they take when done manually. Google's search algorithm is a well-known example of a neural network.

As mentioned in the explanation of neural networks above, but worth noting more explicitly, the “deep” in deep learning refers to the depth of layers in a neural network. A neural network of more than three layers, including the inputs and the output, can be considered a deep-learning algorithm.

Most deep neural networks are feed-forward, meaning data flows in only one direction, from input to output. However, you can also train your model through backpropagation, moving in the opposite direction, from output to input. Backpropagation lets us calculate and attribute the error associated with each neuron, so we can adjust and fit the algorithm appropriately.
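As a minimal sketch of both directions of travel, the toy network below (an assumed setup: one hidden layer, sigmoid activations, XOR used only as a convenient four-example dataset) feeds data forward, then backpropagates the error to every weight:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)  # XOR targets

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)    # input -> hidden
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)    # hidden -> output
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(10000):
    # Feed-forward: data flows input -> hidden -> output.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backpropagation: attribute the error to each neuron, output -> input.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= 0.5 * h.T @ d_out; b2 -= 0.5 * d_out.sum(axis=0)
    W1 -= 0.5 * X.T @ d_h;   b1 -= 0.5 * d_h.sum(axis=0)

print(np.round(out).ravel())  # typically converges to XOR: [0. 1. 1. 0.]
```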

While all these areas of AI can help streamline areas of your business and improve your customer experience, achieving AI goals can be challenging because you’ll first need to ensure that you have the right systems to construct learning algorithms to manage your data. Data management is more than merely building the models that you use for your business. You need a place to store your data and mechanisms for cleaning it and controlling for bias before you can start building anything.

At IBM, we are combining the power of machine learning and artificial intelligence in IBM® watsonx.ai™, our studio for foundation models, generative AI and machine learning.


Advances in Neural Architecture Search


Xin Wang, Wenwu Zhu, Advances in neural architecture search, National Science Review, 2024, nwae282, https://doi.org/10.1093/nsr/nwae282


Automated Machine Learning (AutoML) has made remarkable progress in automating the non-trivial process of designing machine learning models. Among the focal areas of AutoML, Neural Architecture Search (NAS) stands out, aiming to systematically explore the complex architecture space and discover optimal neural architecture configurations without intensive manual intervention. NAS has demonstrated dramatic performance improvements across a large number of real-world tasks. The core components of NAS methodologies normally include i) defining an appropriate search space, ii) designing the right search strategy, and iii) developing an effective evaluation mechanism. Although early NAS endeavors were characterized by groundbreaking architecture designs, their exorbitant computational demands prompted a shift towards more efficient paradigms such as weight sharing and evaluation estimation. Concurrently, the introduction of specialized benchmarks has paved the way for standardized comparisons of NAS techniques. Notably, the adaptability of NAS is evidenced by its capability of extending to diverse data, including graphs, tabular data, and videos, each of which requires a tailored configuration. This paper delves into the multifaceted aspects of NAS, elaborating its recent advances, applications, tools, benchmarks, and prospective research directions.
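As a toy illustration of those three components (not any particular method from the literature), the sketch below uses uniform random sampling as the search strategy and a placeholder scoring function where a real system would train each candidate or estimate its performance:

```python
import random

SEARCH_SPACE = {                # i) the search space
    "depth": [2, 4, 8],
    "width": [32, 64, 128],
    "activation": ["relu", "gelu", "tanh"],
}

def sample_architecture():      # ii) the search strategy: random sampling
    return {name: random.choice(options)
            for name, options in SEARCH_SPACE.items()}

def evaluate(arch):             # iii) the evaluation mechanism; a stand-in
    # score, where a real evaluator would train the candidate or use
    # weight sharing / performance estimation.
    return arch["depth"] * 0.1 + arch["width"] * 0.001

best = max((sample_architecture() for _ in range(50)), key=evaluate)
print("best candidate:", best)
```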

The research for this article was financed by the National Key Research and Development Program of China (No. 2023YFF1205001), the National Natural Science Foundation of China (No. 62222209, 62250008, 62102222), the Beijing National Research Center for Information Science and Technology (Grant No. BNR2023RC01003, BNR2023TD03006), and the Beijing Key Lab of Networked Multimedia.



Title: Topic Modelling Meets Deep Neural Networks: A Survey

Abstract: Topic modelling has been a successful technique for text analysis for almost twenty years. When topic modelling met deep neural networks, there emerged a new and increasingly popular research area, neural topic models, with over a hundred models developed and a wide range of applications in neural language understanding such as text generation, summarisation and language models. There is a need to summarise research developments and discuss open problems and future directions. In this paper, we provide a focused yet comprehensive overview of neural topic models for interested researchers in the AI community, so as to facilitate them to navigate and innovate in this fast-growing research area. To the best of our knowledge, ours is the first review focusing on this specific topic.
Comments: A review on Neural Topic Models
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Information Retrieval (cs.IR)
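To make the idea concrete, here is a minimal sketch of the core structure shared by many neural topic models; the architecture below is an assumed illustration, not one taken from the survey. An encoder maps a bag-of-words vector to a distribution over K topics, and a topic-word decoder reconstructs the document:

```python
import torch
import torch.nn as nn

VOCAB, K = 2000, 20  # illustrative vocabulary and topic counts

class TinyNTM(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(VOCAB, 256), nn.ReLU(),
                                     nn.Linear(256, K))
        self.topic_word = nn.Linear(K, VOCAB, bias=False)  # topic-word matrix

    def forward(self, bow):
        theta = torch.softmax(self.encoder(bow), dim=-1)  # document-topic mix
        return torch.log_softmax(self.topic_word(theta), dim=-1)

model = TinyNTM()
bow = torch.rand(8, VOCAB)                    # fake bag-of-words batch
log_probs = model(bow)
loss = -(bow * log_probs).sum(dim=-1).mean()  # reconstruction loss
loss.backward()
```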


InterviewBit

Top 20 Deep Learning Projects With Source Code

In this article:

  • What Is Deep Learning?
  • Uses of Deep Learning
  • Deep Learning Projects for Beginners (1-4): Image Classification Using the CIFAR-10 Dataset, Dog Breed Identification, Human Face Detection, Music Genre Classification System
  • Intermediate Deep Learning Projects (5-10): Drowsy Driver Detection System, Breast Cancer Detection Using Deep Learning, Gender Recognition Using Voice, Chatbot, Color Detection System, Crop Disease Detection
  • Advanced Deep Learning Projects (11-20): OCR (Optical Character Reader) Using YOLO and Tesseract for Text Extraction, Real-Time Image Animation, Store Item Demand Forecasting, Fake News Detection, Coloring Old Black-and-White Photos, Human Pose Detection, Language Translator Using Deep Learning, Typing Assistant, Hand Gesture Recognition System, Lane Detection and Assistance System
  • Frequently Asked Questions
  • Additional Resources

Despite being a relatively new scientific innovation, the scope of Deep Learning is rapidly expanding. The goal of this technology is to mimic the biological neural network of the human brain. Human brains have neurons that send and receive signals, forming the basis of Neural Networks. While Deep Learning has its roots in the 1950s, it was only recently brought to light by the growth and adoption of Artificial Intelligence and Machine Learning. If you’re new to machine learning, the best thing you can do is brainstorm Deep Learning project ideas. To assist you in your quest, we are going to suggest 20 Deep learning and Neural Network projects.

Deep learning refers to a class of machine learning techniques that employ numerous layers to extract higher-level features from raw data. Lower layers in image processing, for example, may recognize edges, whereas higher layers may identify human-relevant notions like numerals, letters, or faces. Deep learning uses artificial neural networks, which are supposed to mimic how humans think and learn, as opposed to machine learning, which uses simpler principles. Up until recently, the complexity of neural networks was constrained by processing capacity. Larger, more powerful neural networks are now possible thanks to advances in Big Data analytics, allowing computers to monitor, learn, and react to complicated events faster than people. Image categorization, language translation, and speech recognition have all benefited from deep learning. It can tackle any pattern recognition problem without the need for human intervention.

We could never have envisaged deep learning applications bringing us self-driving cars and virtual assistants like Alexa, Siri, and Google Assistant just a few years ago. However, these innovations are already a part of our daily lives. Deep Learning continues to fascinate us with its almost limitless applications, including fraud detection and pixel restoration. Apart from these, Deep learning finds its application in the following industries:


  • Virtual assistants
  • Entertainment
  • Advertising
  • Customer experience
  • Computer vision
  • Language translation

In a real-time work environment, theoretical knowledge alone will not be sufficient. In this article, we’ll look at some fun deep learning project ideas that beginners, as well as experienced, can use to put their skills to the test. The projects covered in this article will serve those who want to get some hands-on experience with the technology. 20 projects along with their GitHub source code link are provided below.

In this project, you’ll create an image classification system that can determine the image’s class. Because image classification is such an important application in the field of deep learning, working on this project will allow you to learn about a variety of deep learning topics.

Working on image categorization is one of the finest ways to get started with hands-on deep learning projects for students. CIFAR-10 is a large dataset of 60,000 color images (32×32 pixels) divided into ten classes, each with 6,000 images. There are 50,000 images in the training set and 10,000 in the test set. The training set is divided into five batches of 10,000 images each, arranged in random order, while the test set contains exactly 1,000 randomly selected images from each of the ten classes.
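A minimal Keras sketch of such a classifier might look like the following; the layer sizes and epoch count are illustrative starting points rather than tuned settings:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Load CIFAR-10 and scale pixel values to [0, 1].
(x_train, y_train), (x_test, y_test) = keras.datasets.cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = keras.Sequential([
    layers.Conv2D(32, 3, activation="relu", input_shape=(32, 32, 3)),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),  # one output per class
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, validation_data=(x_test, y_test))
```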


How frequently do you find yourself wondering about a dog’s breed name? There are numerous dog breeds, and most of them are very similar. Using the dog breeds dataset, we can create a model that can categorize different dog breeds based on an image. Dog lovers will benefit from this endeavor.

To implement this, a convolutional neural network is the obvious solution to an image-recognition challenge. Unfortunately, given the limited number of training examples, any CNN trained only on the provided training images would overfit heavily. To overcome this, the developer used transfer learning from ResNet18 to give the model a head start and dramatically reduce the difficulty of training. Thanks to the deep structure, the model could be complex enough to identify the dogs accurately.
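The gist of that transfer-learning setup, sketched in PyTorch (the breed count here is a hypothetical placeholder for whatever your dataset contains):

```python
import torch.nn as nn
from torchvision import models

NUM_BREEDS = 120  # hypothetical; set this to your dataset's breed count

model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
for param in model.parameters():
    param.requires_grad = False  # freeze the pretrained backbone
# Replace the final layer with a fresh classifier head for dog breeds;
# only this layer is trained, which is what gives the "head start".
model.fc = nn.Linear(model.fc.in_features, NUM_BREEDS)
```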

Face detection is a computer vision problem that entails identifying faces in photographs. It's a simple problem for humans to solve, and classical feature-based algorithms like the cascade classifier have done a good job at it. More recently, deep learning algorithms have attained state-of-the-art results on standard benchmark face detection datasets. We can create models that detect the bounding boxes of the human face with excellent accuracy. This project will teach you how to detect any object in an image and get you started with object detection.

This is an impressive deep learning project concept. You'll build a deep learning model that employs neural networks to automatically classify music genres. The model takes the spectrogram of music frames as input and analyzes the image using a Convolutional Neural Network (CNN) plus a Recurrent Neural Network (RNN). The system's output is a vector of the song's predicted genres. The model was fine-tuned on a tiny sample (30 songs per genre) before being tested on the GTZAN dataset, reaching an accuracy of 80%.
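One common way to produce that spectrogram input is with librosa (an assumed dependency here, not named in the original project; the filename is a placeholder):

```python
import numpy as np
import librosa

# Load 30 seconds of audio and convert it to a log-mel spectrogram,
# the image-like input the CNN+RNN genre model consumes.
y, sr = librosa.load("song.wav", duration=30)  # hypothetical audio file
mel = librosa.feature.melspectrogram(y=y, sr=sr)
log_mel = librosa.power_to_db(mel)
x = log_mel[np.newaxis, ..., np.newaxis]       # (batch, freq, time, channel)
print(x.shape)
```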

One of the leading causes of traffic accidents is driver drowsiness. It's natural for drivers who travel long distances to fall asleep behind the wheel. Drivers might become tired while driving due to a variety of factors, including stress and lack of sleep. By developing a drowsiness detection agent, this project aims to prevent and reduce such accidents. You'll use Python, OpenCV, and Keras to create a system that can detect drivers' closed eyes and alert them if they fall asleep behind the wheel. Even if the driver's eyes are closed for only a few seconds, this system will alert the driver, preventing potentially fatal road accidents. We will use OpenCV to collect images from a camera and feed them into a deep learning model that classifies whether the person's eyes are 'Open' or 'Closed'. The approach, sketched in code after the steps, is as follows:

Step 1: Take an image from a camera as input.

Step 2: Create a region of interest (ROI) around the face in the image.

Step 3: Use the ROI to find the eyes and feed them to the classifier.

Step 4: The classifier determines whether the eyes are open.

Step 5: Calculate a score to decide whether the person is drowsy.
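A compressed sketch of those five steps; the model file and the 24×24 input size are hypothetical placeholders for whatever your trained eye-state classifier expects:

```python
import cv2
import numpy as np
from tensorflow import keras

model = keras.models.load_model("eye_model.h5")  # trained open/closed model
eye_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_eye.xml")

frame = cv2.VideoCapture(0).read()[1]            # Step 1: grab a camera frame
gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
for (x, y, w, h) in eye_cascade.detectMultiScale(gray):  # Steps 2-3: find eyes
    eye = cv2.resize(gray[y:y + h, x:x + w], (24, 24)) / 255.0
    prob_open = model.predict(eye.reshape(1, 24, 24, 1))[0][0]  # Step 4
    print("open" if prob_open > 0.5 else "closed")  # Step 5: update the score
```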

Cancer is a severe disease that needs to be caught as early as possible. Histopathology images can be used to diagnose malignancy. Because cancer cells differ from normal cells, we can use an image classification algorithm to identify the disease at the earliest stage. Deep learning models have achieved a high level of accuracy in this field; a model's accuracy depends on the training data set provided to it.

Breast cancer is the most frequent cancer in women, and the most common type of breast cancer is invasive ductal carcinoma (IDC). Automated approaches can be utilized to save time and reduce errors for detecting and categorizing breast cancer subtypes, which is a crucial clinical activity. 

We can accurately determine a person’s gender by listening to their voice. Machines can also be taught to distinguish between male and female voices. We’ll need audio clips with male and female gender labels. The data is then fed into the classifying model using feature extraction techniques. The link to the source code of the project has been provided below. This project can be extended further to identify the mood of the speaker.

Making a chatbot using deep learning algorithms is another fantastic endeavor. Chatbots can be implemented in a variety of ways, and a smart chatbot will employ deep learning to recognize the context of the user’s question and then offer the appropriate response.

The project given below is a beginner’s walk-through tutorial on how to build a chatbot with deep learning, TensorFlow, and an NMT sequence-to-sequence model.

The project given below can predict up to 11 distinct color classes based on the RGB values users input from sliders: red, green, blue, yellow, orange, pink, purple, brown, grey, black, and white. Each of R (red), G (green), and B (blue) is an integer ranging from 0 to 255, and together these three values produce a distinct solid color for every pixel on a computer, mobile, or any electronic screen. The classifier predicts that solid color's class. The color dataset was also labeled by hand so that the model classifies colors as humanly as possible.

When it comes to using technology in agriculture, one of the most perplexing issues is plant disease detection. Despite the fact that research has been done to determine whether a plant is healthy or diseased utilizing Deep Learning and Neural Networks, new technologies are continually being developed.

You must create a model that uses RGB photos to predict diseases in crops. Convolutional Neural Networks (CNNs) are used to build the crop disease detection model: a CNN identifies and detects disease from an image by passing it through several stages, mapped to code after the list:

  • Convolution operation
  • ReLU layer
  • Pooling
  • Flattening
  • Full connection
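Mapped one-to-one onto Keras layers, those stages look like the following; the input size and class count are illustrative placeholders:

```python
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Conv2D(32, 3, input_shape=(128, 128, 3)),  # convolution operation
    layers.Activation("relu"),                        # ReLU layer
    layers.MaxPooling2D(),                            # pooling
    layers.Flatten(),                                 # flattening
    layers.Dense(38, activation="softmax"),           # full connection
])
model.summary()
```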

Extracting information from a document is a difficult operation that requires object classification and object localization. In many financial, accounting, and taxation fields, OCR digitization addresses the difficulty of automatic extraction, which plays a significant role in speeding up document-intensive operations and office automation.

This custom OCR combines YOLO and Tesseract to read the contents of a lab report and convert it to an editable format. In this case, the developer of the project used YOLO v3 trained on a personal dataset. The coordinates of the detected objects are used to crop them, and the crops are stored in a list; to get the required output, this list is fed into Tesseract.
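A minimal sketch of the crop-then-OCR step, assuming YOLO detections are already available as (x, y, w, h) boxes; pytesseract wraps the Tesseract engine, and the filename and boxes below are placeholders:

```python
import cv2
import pytesseract

image = cv2.imread("lab_report.png")             # hypothetical input document
boxes = [(40, 60, 300, 42), (40, 120, 300, 42)]  # stand-ins for YOLO output

for (x, y, w, h) in boxes:
    crop = image[y:y + h, x:x + w]               # crop each detected region
    text = pytesseract.image_to_string(crop)     # feed the crop to Tesseract
    print(text.strip())
```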

This is an open-source computer vision project in which you use OpenCV to accomplish real-time image animation. The model modifies the image's expression to match the expression of the person in front of the camera.

Using this repository, you can animate a face image from a real-time webcam feed or, if you already have a video of your face, from that video. This project is particularly valuable if you aim to work in the fashion, retail, or advertising industries. This project's code is available on GitHub.

Building a forecasting model to estimate store item demand is difficult due to numerous external factors such as the store’s location, seasonality, changes in the store’s neighborhood or competitive position, a considerable variance in the number of consumers and goods, and so on. With such a large volume of data, no human planner could possibly examine all of the possible elements. Deep learning, on the other hand, makes it easier by taking these characteristics into account at a finer level, by individual store or fulfillment channel.

Consumers can now get the most up-to-date news at their fingertips thanks to the digital age of mobile applications. But are the things we read on these platforms always accurate? No, that is not the case. Take, for example, our favorite chat application, WhatsApp. You have probably received plenty of messages about how to cure and prevent the COVID-19 virus. These messages are frequently fraudulent, and the terrible aspect is that many people believe and even follow them, which has led to some dangerous outcomes. Companies such as Facebook and Google are using AI to detect and remove false news from their platforms.

There are a variety of approaches for attaining this goal, but the goal of this effort is to identify the fishy ones solely by glancing at the text. There are no graphs, social network analysis, or photos. Three deep learning architectures are presented in this paper and then tested on two datasets (the fake news corpus and the TI-CNN), yielding state-of-the-art results.

  • LSTM (Long Short Term Memory) Based architecture
  • CNN (Convolutional Neural Network) Based architecture
  • BERT (Bidirectional Encoder Representations from transformers) Based architecture

Automated colorization of black-and-white photos has become a prominent topic in computer vision and deep learning research. Image colorization takes a grayscale (black-and-white) image as input and outputs a colorized version, for example of an old movie frame. The output should represent and match the semantic colors and tones of the input.

The network is built in four parts and gradually becomes more complex.

  • The alpha network deals with how an image is transformed into RGB pixel values and later translated into LAB pixel values, changing the color space. It also builds a core intuition for how the network learns. 
  • The network in the beta version is very similar to the alpha version. The difference is that we use more than one image to train the network. 
  • The full version adds information from a pre-trained classifier. You can think of the information as 20% nature, 30% humans, 30% sky, and 20% brick buildings. It then learns to combine that information with the black and white photo.
  • The GAN version uses Generative Adversarial Networks to make the coloring more consistent and vibrant. 

Humans are expressive beings. This project was developed using deep learning concepts and can detect the pose you make in front of the camera. Several methods for human pose estimation have been proposed; these algorithms frequently start by identifying the component body parts, then understanding the connections between them to estimate the pose. Activity recognition, motion capture and augmented reality, training robots, and motion tracking for gaming consoles are just a few of the real-world applications of knowing a person's orientation.

Have you ever traveled to a new location and struggled to communicate in the native tongue? I’m sure you’ve tried to imitate the local language and accent with Google Translator at least once. Machine Translation (MT) is a popular topic of computer linguistics that focuses on translation from one language to another. NMT (Neural Machine Translation) has become the most effective method for performing this task as deep learning has grown in popularity and efficiency. We’ve all used Google Translator, which is the industry’s premier machine translation example. An NMT model’s main goal is to take a text input in any language and translate it into a different language as an output.

The developer of this project used RNN sequence-to-sequence learning in Keras to translate English to French.

Devices these days are capable of finishing our sentences even before we type them. Google began automatically finishing my sentence as soon as I started entering the title  “Auto text completion and creation with De…” It correctly predicted Deep Learning in this scenario! 

The project given below provides the ability to autocomplete words and predicts what the next word will be. This allows you to type faster, more intelligently, and with less effort.

The methodology used to implement the project is as follows:

  • Counting words in corpora: Counting things in NLP is based on a corpus. NLTK (Natural Language Toolkit) provides a diverse set of corpora.
  • N-gram model: Probabilistic models are used to compute the likelihood of a complete sentence or to make a probabilistic prediction of the next word in a sequence. In this model, the conditional probability of a word is calculated based on the preceding words.
  • Bigram model: In this model, we approximate the probability of a word given all the previous words by the conditional probability of just the preceding word.
  • Trigram model: A trigram model looks just like a bigram model, except that we condition on the two previous words.
  • Minimum edit distance: The minimum edit distance between two strings measures how similar the two strings are to one another (see the sketch after this list).
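Of these, minimum edit distance is the easiest to show end to end; below is a small dynamic-programming sketch of it:

```python
def edit_distance(a: str, b: str) -> int:
    # dp[i][j] = edits needed to turn a[:i] into b[:j].
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        dp[i][0] = i                                 # delete all of a's prefix
    for j in range(len(b) + 1):
        dp[0][j] = j                                 # insert all of b's prefix
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return dp[-1][-1]

print(edit_distance("kitten", "sitting"))  # 3
```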

Suppose you want to create a cool feature for a smart TV that recognizes five different gestures made by the user and allows them to operate the TV without using a remote.

The webcam positioned on the TV continuously monitors the movements. Each gesture is associated with a distinct command:

  • Increase the volume.
  • Reduce the volume.
  • Left swipe: 'jump' backward 10 seconds.
  • Right swipe: 'jump' forward 10 seconds.
  • Stop: pause the movie.

The project given below achieves this by using training data consisting of a few hundred videos categorized into one of the five classes. Each video (typically 2-3 seconds long) is divided into a sequence of 30 frames (images). The videos were recorded by various people performing one of the five gestures in front of a webcam, similar to what the smart TV will use.

Automatic driving technology has advanced rapidly in recent years. One of the major concerns in the manufacture of self-driving cars is the detection of lane lines. This project is an implementation of the LaneNet model for real-time lane detection using a deep neural network. You will implement a deep neural network for real-time lane detection in TensorFlow, based on an IEEE IV conference article. For the real-time lane detection task, the model includes an encoder-decoder stage, a binary semantic segmentation stage, and instance semantic segmentation using a discriminative loss function.

We have collected 20 deep learning projects that you can develop to polish your skills and improve your portfolio. The technology is still in its infancy; it is continually evolving as we speak. Deep Learning has enormous potential for spawning ground-breaking ideas that can aid humanity in addressing some of the world’s most pressing issues.

How do I start a deep learning project? You can always start with small projects and then move on to tough ones once you are confident enough. You can also check out this free Deep Learning course to master the fundamentals of Deep Learning.

What is CNN deep learning? A Convolutional Neural Network (ConvNet/CNN) is a Deep Learning system that can take an input image, assign relevance (learnable weights and biases) to various aspects/objects in the image, and distinguish between them.

What is Keras API? Keras is a Python-based deep learning API that runs on top of TensorFlow, a machine learning platform. It was created with the goal of allowing for quick experimentation.

What is Kaggle used for? Kaggle is a website where you may share ideas, get inspired, compete against other data scientists, acquire new information and coding methods, and explore real-world data science applications.

  • Deep Learning Vs Machine Learning
  • Applications of Deep Learning
  • Deep Learning
  • Deep Learning Projects


Research Topics in Neural Networks

In artificial intelligence and machine learning, neural network research is a wide and constantly evolving field. It spans theoretical foundations, methodological enhancements, novel frameworks, and a broad range of applications. All the members at phdprojects.org are extremely cooperative and work tirelessly to develop original and novel topics in your area, and we offer dedicated help to deliver a meaningful project. Below, we discuss various recent and evolving research directions in neural networks:

Fundamental Research:

  • Neural Network Theory: Our research investigates the in-depth theoretical properties of neural networks, such as capacity, generalization, and the reasons for their robustness across many tasks.
  • Optimization Methods: We create novel optimization techniques to train neural networks efficiently and accurately.
  • Neural Architecture Search (NAS): Machine learning helps us discover the best network architectures and automate the neural network design process.
  • Quantum Neural Networks: We examine how quantum techniques can improve the efficiency of neural networks and analyze the intersection of neural networks and quantum computing.

Advances in Learning Techniques:

  • Meta-Learning: In meta-learning, a model learns how to learn, improving its performance on each new task while retaining previously gained skills.
  • Federated Learning: We explore training distributed neural networks across many devices while preserving data confidentiality and security.
  • Reinforcement Learning: We enhance methods that enable agents to make sequential decisions by interacting with their environments in pursuit of a goal.
  • Few-Shot and Semi-Supervised Learning: These techniques allow neural network models to learn from a small labeled dataset supplemented with a large unlabeled one.

Enhancing Neural Network Components:

  • Activation Functions: We investigate different activation functions to improve the efficiency and training dynamics of neural networks.
  • Dynamic & Adaptive Networks: This concerns the development of neural networks that adapt their architecture and size during training based on the difficulty of the task.
  • Regularization Methods: We build novel regularization techniques to avoid overfitting and improve a neural network's generalization.

Neural Network Reliability:

  • Explainable AI (XAI): We improve the interpretability of neural network decisions to make models more transparent and trustworthy.
  • Adversarial Machine Learning: Our research explores the security of neural networks, specifically their robustness against adversarial attacks, and develops defenses.
  • Fault Tolerance in Neural Networks: Ensuring that neural networks remain robust even when components fail or data is corrupted.

New Architectures & Frameworks:

  • Capsule Networks: Exploring capsule network frameworks that aim to address the shortcomings of CNNs, including their inefficiency in handling spatial hierarchies.
  • Spiking Neural Networks (SNNs): We create neural frameworks that closely mimic how biological neurons process information, which can lead to more efficient AI systems.
  • Integrated Frameworks: Our projects combine neural networks with statistical models or classical machine learning to leverage the strengths of both.

Neural Networks Applications:

  • Clinical Diagnosis: We advance the use of neural networks in clinical imaging and diagnosis, such as radiology, pathology, and genomics.
  • Climate Modeling: Neural networks help us interpret complex climate systems and improve the accuracy of climate forecasting.
  • Autonomous Systems: We create neural networks for use in autonomous drones, robots, and self-driving cars.
  • Neural Networks in Natural Language Processing (NLP): We employ the latest language models for tasks such as summarization, translation, and question answering.
  • Financial Modeling: Neural networks help us forecast market trends, assess risk, and automate trading.

Cross-disciplinary Concepts:

  • Bio-Inspired Neural Networks: We draw inspiration from neuroscience to develop more robust and effective neural network methods.
  • Neural Networks for Social Good: Our research applies neural networks to societal challenges such as disaster response, poverty analysis, and monitoring the spread of disease.

Evolving Approaches:

  • AI for Creativity: We use neural networks for creative tasks such as generating art, music, designs, and writing.
  • Edge AI: We optimize neural networks so that models run effectively on edge devices, such as IoT devices or smartphones, with limited computational power.

When selecting a research topic, it is important to consider the available resources, your own expertise, and the potential impact of the project. Novel research directions often emerge through collaboration with industry, interdisciplinary communities, and institutions, and such collaboration also opens up practical applications for the project.

Research Projects in Neural Networks

What specific neural network architectures are being explored in the research thesis?

A neural network architecture operates by using organized layers to transform input data into meaningful representations. The input layer receives the raw data, which then undergoes mathematical transformations within one or more hidden layers.

Convolutional Neural Networks (CNNs) excel at image recognition tasks, while Recurrent Neural Networks (RNNs) perform especially well on sequential data. Representative works include:

  • Global Asymptotical Stability of Recurrent Neural Networks With Multiple Discrete Delays and Distributed Delays
  • An Improved Algebraic Criterion for Global Exponential Stability of Recurrent Neural Networks With Time-Varying Delays
  • Finding Features for Real-Time Premature Ventricular Contraction Detection Using a Fuzzy Neural Network System
  • Improved Delay-Dependent Stability Condition of Discrete Recurrent Neural Networks With Time-Varying Delays
  • Experiments in the application of neural networks to rotating machine fault diagnosis
  • Flash-based programmable nonlinear capacitor for switched-capacitor implementations of neural networks
  • Polynomial functions can be realized by finite size multilayer feedforward neural networks
  • Convergence of Nonautonomous Cohen–Grossberg-Type Neural Networks With Variable Delays
  • Analysis and Optimization of Network Properties for Bionic Topology Hopfield Neural Network Using Gaussian-Distributed Small-World Rewiring Method
  • Comparing Support Vector Machines and Feedforward Neural Networks With Similar Hidden-Layer Weights
  • An artificial neural network study of the relationship between arousal, task difficulty and learning
  • Flow-Based Encrypted Network Traffic Classification With Graph Neural Networks
  • Deriving sufficient conditions for global asymptotic stability of delayed neural networks via nonsmooth analysis-II
  • Bifurcating pulsed neural networks, chaotic neural networks and parametric recursions: conciliating different frameworks in neuro-like computing
  • Prediction of internal surface roughness in drilling using three feedforward neural networks – a comparison
  • Comparison of two neural networks approaches to Boolean matrix factorization
  • A new class of convolutional neural networks (SICoNNets) and their application of face detection
  • The Guelph Darwin Project: the evolution of neural networks by genetic algorithms
  • Training neural networks with threshold activation functions and constrained integer weights
  • A commodity trading model based on a neural network-expert system hybrid


Published on 23.8.2024 in Vol 26 (2024)

Digital Epidemiology of Prescription Drug References on X (Formerly Twitter): Neural Network Topic Modeling and Sentiment Analysis

