Distributed Systems and Parallel Computing

No matter how powerful individual computers become, there are still reasons to harness the power of multiple computational units, often spread across large geographic areas. Sometimes this is motivated by the need to collect data from widely dispersed locations (e.g., web pages from servers, or sensors for weather or traffic). Other times it is motivated by the need to perform enormous computations that simply cannot be done by a single CPU.

From our company’s beginning, Google has had to deal with both issues in our pursuit of organizing the world’s information and making it universally accessible and useful. We continue to face many exciting distributed systems and parallel computing challenges in areas such as concurrency control, fault tolerance, algorithmic efficiency, and communication. Some of our research involves answering fundamental theoretical questions, while other researchers and engineers are engaged in the construction of systems to operate at the largest possible scale, thanks to our hybrid research model.


Some of our teams: Algorithms & optimization, Graph mining, Network infrastructure, and System performance.



Parallel and Distributed Computing: Algorithms and Applications


A topical collection in Algorithms (ISSN 1999-4893). This collection belongs to the section "Parallel and Distributed Algorithms".


Topical Collection Information

Dear Colleagues,

Parallel and distributed computing is now ubiquitous in nearly all computational scenarios, ranging from mainstream computing to high-performance and distributed architectures such as cloud architectures and supercomputers. The ever-increasing complexity of parallel and distributed systems requires effective algorithmic techniques for unleashing the enormous computational power of these systems and attaining the promised performance of parallel/distributed computing. Moreover, the new possibilities offered by high-performance systems pave the way for a new class of applications that were considered far-fetched only a short while ago.

This Topical Collection is focused on all algorithmic aspects of parallel and distributed computing and applications. Essentially, every scenario where multiple operations or tasks are executed at the same time is within the scope of this Topical Collection. Topics of interest include (but are not limited to) the following:

  • Theoretical aspects of parallel and distributed computing;
  • Design and analysis of parallel and distributed algorithms;
  • Algorithm engineering in parallel and distributed computing;
  • Load balancing and scheduling techniques;
  • Green computing;
  • Algorithms and applications for big data, machine learning and artificial intelligence;
  • Game-theoretic approaches in parallel and distributed computing;
  • Algorithms and applications on GPUs and multicore or manycore platforms;
  • Cloud computing, edge/fog computing, IoT and distributed computing;
  • Scientific computing;
  • Simulation and visualization;
  • Graph and irregular applications.

Dr. Charalampos Konstantopoulos and Prof. Dr. Grammati Pantziou, Collection Editors

Manuscripts should be submitted online at www.mdpi.com by registering and logging in to this website. Once you are registered, go to the submission form. Manuscripts can be submitted until the deadline. All submissions that pass pre-check are peer-reviewed. Accepted papers will be published continuously in the journal (as soon as accepted) and will be listed together on the collection website. Research articles, review articles, and short communications are invited. For planned papers, a title and short abstract (about 100 words) can be sent to the Editorial Office for announcement on this website.

Submitted manuscripts should not have been published previously, nor be under consideration for publication elsewhere (except conference proceedings papers). All manuscripts are thoroughly refereed through a single-blind peer-review process. A guide for authors and other relevant information for submission of manuscripts is available on the Instructions for Authors page. Algorithms is an international peer-reviewed open access monthly journal published by MDPI.

Please visit the Instructions for Authors page before submitting a manuscript. The Article Processing Charge (APC) for publication in this open access journal is 1600 CHF (Swiss Francs). Submitted papers should be well formatted and use good English. Authors may use MDPI's English editing service prior to publication or during author revisions.

  • Parallel algorithms 
  • Distributed algorithms 
  • Multicore and manycore architectures 
  • Supercomputing 
  • Data centers 
  • Big data 
  • Cloud architectures 

Published Papers (23 papers)



Journal of Parallel and Distributed Computing


Subject Area and Category: Artificial Intelligence; Computer Networks and Communications; Hardware and Architecture; Software; Theoretical Computer Science

Publisher: Academic Press Inc.

ISSN: 0743-7315, 1096-0848


The set of journals has been ranked according to their SJR and divided into four equal groups (quartiles). Q1 (green) comprises the quarter of the journals with the highest values, Q2 (yellow) the second highest, Q3 (orange) the third highest, and Q4 (red) the lowest values.

Quartile by category and year (AI = Artificial Intelligence; CNC = Computer Networks and Communications; H&A = Hardware and Architecture; SW = Software; TCS = Theoretical Computer Science):

Year  AI  CNC  H&A  SW  TCS
1999  Q2  Q2  Q2  Q2  Q2
2000  Q2  Q2  Q2  Q2  Q2
2001  Q3  Q2  Q2  Q2  Q2
2002  Q3  Q3  Q3  Q3  Q4
2003  Q3  Q2  Q2  Q2  Q3
2004  Q2  Q2  Q2  Q2  Q2
2005  Q2  Q2  Q2  Q2  Q3
2006  Q2  Q2  Q2  Q2  Q2
2007  Q2  Q2  Q2  Q3  Q3
2008  Q2  Q2  Q2  Q2  Q2
2009  Q2  Q2  Q2  Q2  Q3
2010  Q2  Q2  Q2  Q2  Q3
2011  Q2  Q2  Q2  Q2  Q3
2012  Q3  Q2  Q2  Q2  Q3
2013  Q3  Q2  Q2  Q2  Q3
2014  Q2  Q2  Q2  Q2  Q3
2015  Q2  Q1  Q1  Q2  Q2
2016  Q2  Q1  Q1  Q2  Q2
2017  Q2  Q2  Q1  Q2  Q2
2018  Q2  Q2  Q2  Q2  Q3
2019  Q2  Q2  Q2  Q2  Q3
2020  Q2  Q1  Q1  Q2  Q2
2021  Q1  Q1  Q1  Q1  Q1
2022  Q2  Q1  Q1  Q1  Q1
2023  Q2  Q1  Q1  Q1  Q1

The SJR is a size-independent prestige indicator that ranks journals by their 'average prestige per article'. It is based on the idea that 'all citations are not created equal'. SJR is a measure of the scientific influence of journals that accounts for both the number of citations received by a journal and the importance or prestige of the journals from which such citations come. It measures the scientific influence of the average article in a journal and expresses how central to the global scientific discussion an average article of the journal is.

Year  SJR
1999  0.358
2000  0.436
2001  0.444
2002  0.312
2003  0.459
2004  0.499
2005  0.489
2006  0.490
2007  0.442
2008  0.586
2009  0.489
2010  0.509
2011  0.485
2012  0.397
2013  0.437
2014  0.548
2015  0.614
2016  0.597
2017  0.502
2018  0.417
2019  0.525
2020  0.638
2021  1.289
2022  1.158
2023  1.187

Evolution of the number of published documents. All types of documents are considered, including citable and non-citable documents.

Year  Documents
1999  71
2000  67
2001  91
2002  84
2003  102
2004  103
2005  124
2006  124
2007  92
2008  119
2009  89
2010  104
2011  135
2012  145
2013  143
2014  123
2015  97
2016  86
2017  171
2018  207
2019  214
2020  170
2021  175
2022  155
2023  125

This indicator counts the number of citations received by documents from a journal and divides them by the total number of documents published in that journal. The chart shows the evolution of the average number of times documents published in a journal in the past two, three, and four years have been cited in the current year. For example, the 2023 three-year value of 5.174 corresponds to the 2,587 citations received in 2023 by documents published in 2020-2022, divided by the 500 documents published in that window (170 + 175 + 155). The two-year line is equivalent to the Journal Impact Factor™ (Thomson Reuters) metric.

Year  Cites/Doc (4 years)  Cites/Doc (3 years)  Cites/Doc (2 years)
1999  0.890  0.890  0.887
2000  1.014  1.116  0.907
2001  0.997  0.825  0.746
2002  0.854  0.777  0.753
2003  1.058  1.066  1.074
2004  1.535  1.552  1.575
2005  1.905  1.869  1.971
2006  1.685  1.775  1.256
2007  1.704  1.390  1.222
2008  1.603  1.591  1.713
2009  1.845  1.910  2.085
2010  2.179  2.397  2.264
2011  2.495  2.429  1.948
2012  2.244  2.079  2.113
2013  2.228  2.313  2.079
2014  2.624  2.461  2.351
2015  2.484  2.455  2.466
2016  2.868  2.981  2.882
2017  3.131  3.271  2.863
2018  2.994  2.669  2.261
2019  3.198  3.134  3.204
2020  3.729  3.939  4.468
2021  4.349  4.997  5.401
2022  4.872  5.211  4.965
2023  5.183  5.174  4.736

Evolution of the total number of citations and journal self-citations received by a journal's published documents during the three previous years. Journal self-citation is defined as the number of citations from a journal's citing articles to articles published by the same journal.

Year  Self Cites  Total Cites
1999  17  325
2000  13  317
2001  9  179
2002  4  178
2003  11  258
2004  13  430
2005  20  540
2006  18  584
2007  10  488
2008  30  541
2009  14  640
2010  18  719
2011  19  758
2012  23  682
2013  41  888
2014  35  1041
2015  23  1009
2016  24  1082
2017  35  1001
2018  42  945
2019  74  1454
2020  64  2332
2021  66  2953
2022  37  2913
2023  71  2587

Evolution of the number of total citations per document and external citations per document (i.e., with journal self-citations removed) received by a journal's published documents during the three previous years. External citations are calculated by subtracting the number of self-citations from the total number of citations received by the journal's documents.

Year  External Cites per Document  Cites per Document
1999  0.844  0.890
2000  1.070  1.116
2001  0.783  0.825
2002  0.760  0.777
2003  1.021  1.066
2004  1.505  1.552
2005  1.799  1.869
2006  1.720  1.775
2007  1.362  1.390
2008  1.503  1.591
2009  1.869  1.910
2010  2.337  2.397
2011  2.369  2.429
2012  2.009  2.079
2013  2.206  2.313
2014  2.378  2.461
2015  2.399  2.455
2016  2.915  2.981
2017  3.157  3.271
2018  2.551  2.669
2019  2.974  3.134
2020  3.831  3.939
2021  4.885  4.997
2022  5.145  5.211
2023  5.032  5.174

International Collaboration accounts for the articles that have been produced by researchers from several countries. The chart shows the ratio of a journal's documents signed by researchers from more than one country; that is, documents including more than one country among the author addresses.

Year  International Collaboration (%)
1999  23.94
2000  26.87
2001  18.68
2002  27.38
2003  26.47
2004  25.24
2005  26.61
2006  28.23
2007  27.17
2008  25.21
2009  25.84
2010  32.69
2011  26.67
2012  29.66
2013  30.77
2014  30.08
2015  37.11
2016  29.07
2017  32.75
2018  43.96
2019  37.85
2020  48.24
2021  35.43
2022  38.06
2023  38.40

Not every article in a journal is considered primary research and therefore "citable". This chart shows the ratio of a journal's articles including substantial research (research articles, conference papers, and reviews) in three-year windows versus those documents other than research articles, reviews, and conference papers.

Year  Non-citable Documents  Citable Documents
1999  2  363
2000  1  283
2001  0  217
2002  5  224
2003  8  234
2004  14  263
2005  9  280
2006  8  321
2007  8  343
2008  10  330
2009  11  324
2010  7  293
2011  6  306
2012  7  321
2013  7  377
2014  13  410
2015  12  399
2016  13  350
2017  8  298
2018  11  343
2019  15  449
2020  24  568
2021  20  571
2022  17  542
2023  10  490

Ratio of a journal's items, grouped in three-year windows, that have been cited at least once versus those not cited during the following year.

Year  Uncited Documents  Cited Documents
1999  201  164
2000  156  128
2001  134  83
2002  142  87
2003  134  108
2004  143  134
2005  132  157
2006  137  192
2007  158  193
2008  148  192
2009  119  216
2010  95  205
2011  94  218
2012  129  199
2013  129  255
2014  132  291
2015  134  277
2016  103  260
2017  91  215
2018  105  249
2019  133  331
2020  151  441
2021  141  450
2022  137  422
2023  103  397

Evolution of the percentage of female authors.

Year  Female Authors (%)
1999  14.94
2000  13.51
2001  15.92
2002  13.83
2003  17.11
2004  15.79
2005  19.34
2006  15.00
2007  20.24
2008  15.43
2009  18.53
2010  19.73
2011  22.19
2012  19.62
2013  16.90
2014  18.06
2015  16.99
2016  21.72
2017  19.29
2018  21.38
2019  19.50
2020  20.00
2021  18.18
2022  22.57
2023  23.94

Evolution of the number of documents cited by public policy documents, according to the Overton database.

Year  Overton Documents
1999  1
2000  1
2001  0
2002  0
2003  0
2004  0
2005  3
2006  1
2007  0
2008  2
2009  1
2010  1
2011  0
2012  0
2013  2
2014  3
2015  1
2016  0
2017  2
2018  2
2019  4
2020  0
2021  0
2022  0
2023  0

Evolution of the number of documents related to the Sustainable Development Goals defined by the United Nations. Available from 2018 onwards.

Year  SDG Documents
2018  30
2019  36
2020  26
2021  16
2022  20
2023  12

Source: Scimago Journal & Country Rank (data from Scopus®).


Distributed, Parallel, and Cluster Computing


New submissions for Thursday, 5 September 2024 (showing 5 of 5 entries)

The fault tolerance method currently used in High Performance Computing (HPC) is the rollback-recovery method using checkpoints. Like any other fault tolerance method, it adds energy consumption on top of that of the application's execution. The objective of this work is to determine the factors that affect the energy consumption of the compute nodes of a homogeneous cluster when performing checkpoint and restart operations on SPMD (Single Program Multiple Data) applications. We have focused on the energy study of compute nodes, considering different configurations of hardware and software parameters. We studied the effect of the processors' performance states (P-states) and power states (C-states), the application problem size, the checkpoint software (DMTCP), and the distributed file system (NFS) configuration. The analysis of the results allowed us to identify opportunities to reduce the energy consumption of checkpoint and restart operations.
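To make the accounting concrete, the energy overhead of checkpoint/restart can be approximated by integrating sampled node power over the checkpoint and restart intervals. The Python sketch below illustrates this bookkeeping; the power traces, durations, and resulting percentages are hypothetical and are not taken from the study.

```python
# Minimal sketch: estimating the energy overhead of checkpoint/restart
# operations from sampled node power. All values are hypothetical.

def energy_joules(power_samples_w, interval_s):
    """Approximate energy as the sum of power samples times the sampling interval."""
    return sum(power_samples_w) * interval_s

# Hypothetical power traces (watts), sampled once per second.
baseline_power = [180.0] * 600        # application running, no checkpoint
checkpoint_power = [210.0] * 45       # during a DMTCP-style checkpoint
restart_power = [195.0] * 30          # during a restart

overhead = energy_joules(checkpoint_power, 1.0) + energy_joules(restart_power, 1.0)
total = energy_joules(baseline_power, 1.0) + overhead
print(f"checkpoint/restart overhead: {overhead:.0f} J "
      f"({100 * overhead / total:.1f}% of the total)")
```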

This work examines the resilience properties of the Snowball and Avalanche protocols that underlie the popular Avalanche blockchain. We experimentally quantify the resilience of Snowball using a simulation implemented in Rust, where the adversary strategically rebalances the network to delay termination. We show that in a network of $n$ nodes of equal stake, the adversary is able to break liveness when controlling $\Omega(\sqrt{n})$ nodes. Specifically, for $n = 2000$, a simple adversary controlling $5.2\%$ of stake can successfully attack liveness. When the adversary is given additional information about the state of the network (without any communication or other advantages), the stake needed for a successful attack is as little as $2.8\%$. We show that the adversary can break safety in time exponentially dependent on their stake, and inversely linearly related to the size of the network, e.g. in 265 rounds in expectation when the adversary controls $25\%$ of a network of 3000. We conclude that Snowball and Avalanche are akin to Byzantine reliable broadcast protocols as opposed to consensus.
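For readers unfamiliar with this protocol family, the sketch below gives a heavily simplified, single-node view of a Snowball-style query loop in Python. The parameters (k, alpha, beta), the peer model, and the 50/50 network split are illustrative and are not the protocol's production settings or the paper's simulator.

```python
# Simplified, single-node view of a Snowball-style query loop.
import random

def snowball_step(peers, k=20, alpha=14):
    """Query k random peers; return the color that reached the alpha
    threshold, or None if no color did."""
    votes = random.sample(peers, k)
    for color in set(votes):
        if votes.count(color) >= alpha:
            return color
    return None

def snowball(peers, my_color="red", beta=15, max_rounds=10_000):
    counts = {"red": 0, "blue": 0}
    consecutive, last = 0, None
    for _ in range(max_rounds):
        winner = snowball_step(peers)
        if winner is None:
            consecutive = 0
            continue
        counts[winner] += 1
        # Adopt the color that has succeeded most often so far.
        my_color = max(counts, key=counts.get)
        consecutive = consecutive + 1 if winner == last else 1
        last = winner
        if consecutive >= beta:
            return my_color        # decided
    return None                    # no decision within the budget (liveness issue)

# A 2000-node network kept close to a 50/50 split, as an adversary might maintain it.
network = ["red"] * 1000 + ["blue"] * 1000
print(snowball(network))
```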

Data Parallelism (DP), Tensor Parallelism (TP), and Pipeline Parallelism (PP) are the three strategies widely adopted to enable fast and efficient Large Language Model (LLM) training. However, these approaches rely on data-intensive communication routines to collect, aggregate, and re-distribute gradients, activations, and other important model information, which pose significant overhead. Co-designed with GPU-based compression libraries, MPI libraries have been proven to reduce message size significantly, and leverage interconnect bandwidth, thus increasing training efficiency while maintaining acceptable accuracy. In this work, we investigate the efficacy of compression-assisted MPI collectives under the context of distributed LLM training using 3D parallelism and ZeRO optimizations. We scaled up to 192 V100 GPUs on the Lassen supercomputer. First, we enabled a naïve compression scheme across all collectives and observed a 22.5\% increase in TFLOPS per GPU and a 23.6\% increase in samples per second for GPT-NeoX-20B training. Nonetheless, such a strategy ignores the sparsity discrepancy among messages communicated in each parallelism degree, thus introducing more errors and causing degradation in training loss. Therefore, we incorporated hybrid compression settings toward each parallel dimension and adjusted the compression intensity accordingly. Given their low-rank structure ( arXiv:2301.02654 ), we apply aggressive compression on gradients when performing DP All-reduce. We adopt milder compression to preserve precision while communicating activations, optimizer states, and model parameters in TP and PP. Using the adjusted hybrid compression scheme, we demonstrate a 17.3\% increase in TFLOPS per GPU and a 12.7\% increase in samples per second while reaching baseline loss convergence.
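As a rough illustration of the kind of message-size reduction involved, the following Python sketch shows top-k sparsification of a gradient tensor of the sort that could be applied aggressively to data-parallel all-reduces and more mildly elsewhere. The compression ratio and tensor are made up, and no specific MPI or compression library API is implied.

```python
# Top-k gradient sparsification: keep only the largest-magnitude entries.
import numpy as np

def topk_compress(grad: np.ndarray, ratio: float):
    """Return (indices, values) of the largest-magnitude entries of grad."""
    flat = grad.ravel()
    k = max(1, int(flat.size * ratio))
    idx = np.argpartition(np.abs(flat), -k)[-k:]
    return idx, flat[idx]

def topk_decompress(idx, vals, shape):
    """Scatter the kept values back into a dense tensor of the given shape."""
    out = np.zeros(int(np.prod(shape)), dtype=vals.dtype)
    out[idx] = vals
    return out.reshape(shape)

grad = np.random.randn(1024, 1024).astype(np.float32)
# Hypothetically: aggressive ratio for DP gradients, milder for TP/PP traffic.
idx, vals = topk_compress(grad, ratio=0.01)
restored = topk_decompress(idx, vals, grad.shape)
print("kept", vals.size, "of", grad.size, "entries")
```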

Computation offloading with lower time and lower energy consumption is crucial for resource-limited mobile devices. This paper proposes an offloading decision-making model using federated learning. Based on the task type and the user input, the proposed decision-making model predicts whether the task is computationally intensive or not. If the predicted result is computationally intensive, then based on the network parameters the proposed decision-making model predicts whether to offload or locally execute the task. According to the predicted result, the task is either locally executed or offloaded to the edge server. The proposed method is implemented in a real-time environment, and the experimental results show that the proposed method has achieved above 90% prediction accuracy in offloading decision-making. The experimental results also show that the proposed offloading method reduces the response time and energy consumption of the user device by ~11-31% for computationally intensive tasks. A partial computation offloading method for federated learning is also proposed and implemented in this paper, where the devices that are unable to analyse the huge number of data samples offload a part of their local datasets to the edge server. For secure data transmission, cryptography is used. The experimental results show that using encryption and decryption increases the total time by only 0.05-0.16%. The results also show that the proposed partial computation offloading method for federated learning has achieved a prediction accuracy of above 98% for the global model.
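A minimal sketch of such a two-stage decision rule is shown below; the thresholds, feature names, and hard-coded classifiers are placeholders standing in for the paper's trained federated models.

```python
# Illustrative two-stage offloading decision; all thresholds are hypothetical.

def is_compute_intensive(task_type: str, input_size_mb: float) -> bool:
    # Stage 1: stand-in for the federated model that classifies the task.
    return task_type in {"object_detection", "video_transcode"} or input_size_mb > 50

def should_offload(bandwidth_mbps: float, latency_ms: float, edge_load: float) -> bool:
    # Stage 2: stand-in for the network-aware offloading model.
    return bandwidth_mbps > 10 and latency_ms < 80 and edge_load < 0.8

def decide(task_type, input_size_mb, bandwidth_mbps, latency_ms, edge_load):
    if not is_compute_intensive(task_type, input_size_mb):
        return "execute locally"
    if should_offload(bandwidth_mbps, latency_ms, edge_load):
        return "offload to edge server"
    return "execute locally"

print(decide("object_detection", 120, bandwidth_mbps=40, latency_ms=25, edge_load=0.3))
```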

MPI+X has been the de facto standard for distributed memory parallel programming. It is widely used primarily as an explicit two-sided communication model, which often leads to complex and error-prone code. Alternatively, the PGAS model utilizes efficient one-sided communication and more intuitive communication primitives. In this paper, we present a novel approach that integrates PGAS concepts into the OpenMP programming model, leveraging the LLVM compiler infrastructure and the GASNet-EX communication library. Our model addresses the complexity associated with traditional MPI+OpenMP programming models while ensuring excellent performance and scalability. We evaluate our approach using a set of micro-benchmarks and application kernels on two distinct platforms: Ookami from Stony Brook University and NERSC Perlmutter. The results demonstrate that DiOMP achieves superior bandwidth and lower latency compared to MPI+OpenMP, with up to 25% higher bandwidth and up to 45% lower latency. DiOMP offers a promising alternative to the traditional MPI+OpenMP hybrid programming model, providing a more productive and efficient way to develop high-performance parallel applications for distributed memory systems.

Cross submissions for Thursday, 5 September 2024 (showing 4 of 4 entries)

Bugs in popular distributed protocol implementations have been the source of many downtimes in popular internet services. We describe a randomized testing approach for distributed protocol implementations based on reinforcement learning. Since the natural reward structure is very sparse, the key to successful exploration in reinforcement learning is reward augmentation. We show two different techniques that build on one another. First, we provide a decaying exploration bonus based on the discovery of new states -- the reward decays as the same state is visited multiple times. The exploration bonus captures the intuition from coverage-guided fuzzing of prioritizing new coverage points; in contrast to other schemes, we show that taking the maximum of the bonus and the Q-value leads to more effective exploration. Second, we provide waypoints to the algorithm as a sequence of predicates that capture interesting semantic scenarios. Waypoints exploit designer insight about the protocol and guide the exploration to "interesting" parts of the state space. Our reward structure ensures that new episodes can reliably get to deep interesting states even without execution caching. We have implemented our algorithm in Go. Our evaluation on three large benchmarks (RedisRaft, Etcd, and RSL) shows that our algorithm can significantly outperform baseline approaches in terms of coverage and bug finding.
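The following Python sketch illustrates the first idea: a visit-count-based exploration bonus combined with the learned Q-value by taking their maximum. The constants and the toy state/action names are illustrative; this is not the Go implementation described above.

```python
# Q-learning-style update with a decaying, visit-count-based exploration bonus.
from collections import defaultdict

ALPHA, GAMMA, BONUS_SCALE = 0.1, 0.95, 1.0
visits = defaultdict(int)
Q = defaultdict(float)               # keyed by (state, action)

def exploration_bonus(state):
    # Bonus decays as the same state is visited multiple times.
    return BONUS_SCALE / (1 + visits[state])

def update(state, action, reward, next_state, next_actions):
    visits[next_state] += 1
    # Take the max of the bonus and the best Q-value as the bootstrap target,
    # so rarely visited states stay attractive even with low learned values.
    best_q = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + GAMMA * max(exploration_bonus(next_state), best_q)
    Q[(state, action)] += ALPHA * (target - Q[(state, action)])

# Hypothetical protocol-testing step: states and actions are placeholders.
update(state="s0", action="deliver_msg", reward=0.0,
       next_state="s1", next_actions=["deliver_msg", "drop_msg"])
print(Q[("s0", "deliver_msg")])
```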

Modern accelerators like GPUs are increasingly executing independent operations concurrently to improve the device's compute utilization. However, effectively harnessing it on GPUs for important primitives such as general matrix multiplications (GEMMs) remains challenging. Although modern GPUs have significant hardware and software support for GEMMs, their kernel implementations and optimizations typically assume each kernel executes in isolation and can utilize all GPU resources. This approach is highly efficient when kernels execute in isolation, but causes significant resource contention and slowdowns when kernels execute concurrently. Moreover, current approaches often only statically expose and control parallelism within an application, without considering runtime information such as varying input size and concurrent applications -- often exacerbating contention. These issues limit performance benefits from concurrently executing independent operations. Accordingly, we propose GOLDYLOC, which considers the global resources across all concurrent operations to identify performant GEMM kernels, which we call globally optimized (GO)-Kernels. Moreover, GOLDYLOC introduces a lightweight dynamic logic which considers the dynamic execution environment for available parallelism and input sizes to execute performant combinations of concurrent GEMMs on the GPU. Overall, GOLDYLOC improves performance of concurrent GEMMs on a real GPU by up to 2$\times$ (18% geomean per workload) and provides up to 2.5$\times$ (43% geomean per workload) speedups over sequential execution.

Fortran's prominence in scientific computing requires strategies to ensure both that legacy codes are efficient on high-performance computing systems, and that the language remains attractive for the development of new high-performance codes. Coarray Fortran (CAF), part of the Fortran 2008 standard introduced for parallel programming, facilitates distributed memory parallelism with a syntax familiar to Fortran programmers, simplifying the transition from single-processor to multi-processor coding. This research focuses on innovating and refining a parallel programming methodology that fuses the strengths of Intel Coarray Fortran, Nvidia CUDA Fortran, and OpenMP for distributed memory parallelism, high-speed GPU acceleration and shared memory parallelism respectively. We consider the management of pageable and pinned memory, CPU-GPU affinity in NUMA multiprocessors, and robust compiler interfacing with speed optimisation. We demonstrate our method through its application to a parallelised Poisson solver and compare the methodology, implementation, and scaling performance to that of the Message Passing Interface (MPI), finding CAF offers similar speeds with easier implementation. For new codes, this approach offers a faster route to optimised parallel computing. For legacy codes, it eases the transition to parallel computing, allowing their transformation into scalable, high-performance computing applications without the need for extensive re-design or additional syntax.

Parameter-Efficient Fine-Tuning (PEFT) has risen as an innovative training strategy that updates only a select few model parameters, significantly lowering both computational and memory demands. PEFT also helps to decrease data transfer in federated learning settings, where communication depends on the size of updates. In this work, we explore the constraints of previous studies that integrate a well-known PEFT method named LoRA with federated fine-tuning, then introduce RoLoRA, a robust federated fine-tuning framework that utilizes an alternating minimization approach for LoRA, providing greater robustness against decreasing fine-tuning parameters and increasing data heterogeneity. Our results indicate that RoLoRA not only presents the communication benefits but also substantially enhances the robustness and effectiveness in multiple federated fine-tuning scenarios.
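As a toy illustration of alternating minimization over low-rank factors, the optimization pattern that RoLoRA builds on, the NumPy sketch below alternately solves least-squares problems for the two factors of a rank-r update. It fits a fixed target matrix rather than a federated fine-tuning loss, and all sizes are arbitrary.

```python
# Alternating least squares for a low-rank update  W ≈ B @ A  (toy example).
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 4
target = rng.standard_normal((d, r)) @ rng.standard_normal((r, d))  # rank-r target

A = rng.standard_normal((r, d)) * 0.01
B = np.zeros((d, r))
for _ in range(20):
    # Fix A, solve for B:  min_B ||target - B @ A||_F
    B = np.linalg.lstsq(A.T, target.T, rcond=None)[0].T
    # Fix B, solve for A:  min_A ||target - B @ A||_F
    A = np.linalg.lstsq(B, target, rcond=None)[0]

print("residual:", np.linalg.norm(target - B @ A))
```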

Replacement submissions for Thursday, 5 September 2024 (showing 8 of 8 entries)

Decentralized Federated Learning (DFL) emerges as an innovative paradigm to train collaborative models, addressing the single point of failure limitation. However, the security and trustworthiness of FL and DFL are compromised by poisoning attacks, negatively impacting their performance. Existing defense mechanisms have been designed for centralized FL and do not adequately exploit the particularities of DFL. Thus, this work introduces Sentinel, a defense strategy to counteract poisoning attacks in DFL. Sentinel leverages the accessibility of local data and defines a three-step aggregation protocol consisting of similarity filtering, bootstrap validation, and normalization to safeguard against malicious model updates. Sentinel has been evaluated with diverse datasets and data distributions, and various poisoning attack types and threat levels have been verified. The results improve the state-of-the-art performance against both untargeted and targeted poisoning attacks when data follows an IID (Independent and Identically Distributed) configuration. In addition, under a non-IID configuration, the analysis shows how performance degrades for both Sentinel and other state-of-the-art robust aggregation methods.
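A minimal sketch of the similarity-filtering idea is shown below: peer updates whose cosine similarity to the local update falls below a threshold are discarded before averaging. The threshold, the plain averaging step, and the synthetic updates are illustrative simplifications of Sentinel's full three-step protocol.

```python
# Cosine-similarity filtering of peer model updates before aggregation.
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-12))

def filter_and_average(local_update, peer_updates, threshold=0.2):
    kept = [u for u in peer_updates if cosine(local_update, u) >= threshold]
    kept.append(local_update)           # always keep the node's own update
    return np.mean(kept, axis=0)

rng = np.random.default_rng(1)
local = rng.standard_normal(10)
honest = [local + 0.1 * rng.standard_normal(10) for _ in range(5)]
poisoned = [-local]                     # an update pointing the opposite way
print(filter_and_average(local, honest + poisoned))
```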

Many computational chemistry and molecular simulation workflows can be expressed as graphs. This abstraction is useful to modularize and potentially reuse existing components, as well as provide parallelization and ease reproducibility. Existing tools represent the computation as a directed acyclic graph (DAG), thus allowing efficient execution by parallelization of concurrent branches. These systems can, however, generally not express cyclic and conditional workflows. We therefore developed Maize, a workflow manager for cyclic and conditional graphs based on the principles of flow-based programming. By running each node of the graph concurrently in separate processes and allowing communication at any time through dedicated inter-node channels, arbitrary graph structures can be executed. We demonstrate the effectiveness of the tool on a dynamic active learning task in computational drug design, involving the use of a small molecule generative model and an associated scoring system, and on a reactivity prediction pipeline using quantum-chemistry and semiempirical approaches.

Distributed optimization is the standard way of speeding up machine learning training, and most of the research in the area focuses on distributed first-order, gradient-based methods. Yet, there are settings where some computationally-bounded nodes may not be able to implement first-order, gradient-based optimization, while they could still contribute to joint optimization tasks. In this paper, we initiate the study of hybrid decentralized optimization, studying settings where nodes with zeroth-order and first-order optimization capabilities co-exist in a distributed system, and attempt to jointly solve an optimization task over some data distribution. We essentially show that, under reasonable parameter settings, such a system can not only withstand noisier zeroth-order agents but can even benefit from integrating such agents into the optimization process, rather than ignoring their information. At the core of our approach is a new analysis of distributed optimization with noisy and possibly-biased gradient estimators, which may be of independent interest. Our results hold for both convex and non-convex objectives. Experimental results on standard optimization tasks confirm our analysis, showing that hybrid first-zeroth order optimization can be practical, even when training deep neural networks.
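For context, a computationally bounded node of the kind described above could rely on a two-point zeroth-order gradient estimator instead of backpropagation. The sketch below shows such an estimator on a toy quadratic objective; the smoothing parameter and step size are chosen arbitrarily.

```python
# Two-point zeroth-order gradient estimate: only function evaluations are needed.
import numpy as np

def zo_gradient(f, x, mu=1e-3, rng=np.random.default_rng(0)):
    u = rng.standard_normal(x.shape)                       # random direction
    return (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u  # directional estimate

f = lambda x: float(np.sum((x - 1.0) ** 2))   # toy convex objective, optimum at 1
x = np.zeros(5)
for _ in range(500):
    x -= 0.05 * zo_gradient(f, x)
print(x)                                      # drifts toward the optimum at 1
```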

A range of data insight analytical tasks involves analyzing a large set of tables of different schemas, possibly induced by various groupings, to find salient patterns. This paper presents Multi-Relational Algebra, an extension of the classic Relational Algebra, to facilitate such transformations and their compositions. Multi-Relational Algebra has two main characteristics: (1) Information Unit. The information unit is a slice $(r, X)$, where $r$ is a (region) tuple, and $X$ is a (feature) table. Specifically, a slice can encompass multiple columns, which surpasses the information unit of "a single tuple" or "a group of tuples of one column" in the classic relational algebra, (2) Schema Flexibility. Slices can have varying schemas, not constrained to a single schema. This flexibility further expands the expressive power of the algebra. Through various examples, we show that multi-relational algebra can effortlessly express many complex analytic problems, some of which are beyond the scope of traditional relational analytics. We have implemented and deployed a service for multi-relational analytics. Due to a unified logical design, we are able to conduct systematic optimization for a variety of seemingly different tasks. Our service has garnered interest from numerous internal teams who have developed data-insight applications using it, and serves millions of operators daily.

Spectral Deferred Correction (SDC) is an iterative method for the numerical solution of ordinary differential equations. It works by refining the numerical solution for an initial value problem by approximately solving differential equations for the error, and can be interpreted as a preconditioned fixed-point iteration for solving the fully implicit collocation problem. We adopt techniques from embedded Runge-Kutta Methods (RKM) to SDC in order to provide a mechanism for adaptive time step size selection and thus increase computational efficiency of SDC. We propose two SDC-specific estimates of the local error that are generic and do not rely on problem specific quantities. We demonstrate a gain in efficiency over standard SDC with fixed step size and compare efficiency favorably against state-of-the-art adaptive RKM.

Federated Learning (FL) is an interesting strategy that enables the collaborative training of an AI model among different data owners without revealing their private datasets. Even so, FL has some privacy vulnerabilities that have been addressed by applying techniques like Differential Privacy (DP), Homomorphic Encryption (HE), or Secure Multi-Party Computation (SMPC). However, these techniques have some important drawbacks that might narrow their range of application: problems working with non-linear functions, difficulties in performing large matrix multiplications, and high communication and computational costs to manage semi-honest nodes. In this context, we propose a solution to guarantee privacy in FL schemes that simultaneously solves the previously mentioned problems. Our proposal is based on Berrut Approximated Coded Computing, a technique from the Coded Distributed Computing paradigm, adapted to a Secret Sharing configuration, to provide input privacy to FL in a scalable way. It can be applied for computing non-linear functions and treats the special case of distributed matrix multiplication, a key primitive at the core of many automated learning tasks. Because of these characteristics, it could be applied in a wide range of FL scenarios, since it is independent of the machine learning models or aggregation algorithms used in the FL scheme. We provide an analysis of the achieved privacy and complexity of our solution, and the extensive numerical results show a good trade-off between privacy and precision.
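The numerical core that Berrut Approximated Coded Computing builds on is Berrut's rational interpolant. The sketch below evaluates that interpolant on a small set of Chebyshev-like nodes; the nodes and the encoded function are illustrative, and the snippet omits all of the secret-sharing and distributed-evaluation machinery.

```python
# Berrut's first rational interpolant: r(x) = sum_j w_j f_j / (x - x_j) / sum_j w_j / (x - x_j),
# with alternating-sign weights w_j = (-1)^j.
import numpy as np

def berrut_interpolate(x_nodes, f_values, x):
    """Evaluate Berrut's rational interpolant at x (x must not be a node)."""
    signs = np.array([(-1) ** j for j in range(len(x_nodes))], dtype=float)
    weights = signs / (x - x_nodes)
    return float(weights @ f_values / weights.sum())

# Chebyshev-like nodes and a smooth function standing in for the encoded data.
nodes = np.cos(np.pi * (np.arange(8) + 0.5) / 8)
values = np.sin(nodes)
print(berrut_interpolate(nodes, values, 0.3), np.sin(0.3))
```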

Decentralized Intelligence Network (DIN) is a theoretical framework designed to address challenges in AI development, particularly focusing on data fragmentation and siloing issues. It facilitates effective AI training within sovereign data networks by overcoming barriers to accessing diverse data sources, leveraging: 1) personal data stores to ensure data sovereignty, where data remains securely within Participants' control; 2) a scalable federated learning protocol implemented on a public blockchain for decentralized AI training, where only model parameter updates are shared, keeping data within the personal data stores; and 3) a scalable, trustless cryptographic rewards mechanism on a public blockchain to incentivize participation and ensure fair reward distribution through a decentralized auditing protocol. This approach guarantees that no entity can prevent or control access to training data or influence financial benefits, as coordination and reward distribution are managed on the public blockchain with an immutable record. The framework supports effective AI training by allowing Participants to maintain control over their data, benefit financially, and contribute to a decentralized, scalable ecosystem that leverages collective AI to develop beneficial algorithms.

Decentralized Health Intelligence Network (DHIN) extends the Decentralized Intelligence Network (DIN) framework to address challenges in healthcare data sovereignty and AI utilization. Building upon DIN's core principles, DHIN introduces healthcare-specific components to tackle data fragmentation across providers and institutions, establishing a sovereign architecture for healthcare provision. It facilitates effective AI utilization by overcoming barriers to accessing diverse health data sources. This comprehensive framework leverages: 1) self-sovereign identity architecture coupled with a personal health record (PHR), extending DIN's personal data stores concept to ensure health data sovereignty; 2) a scalable federated learning (FL) protocol implemented on a public blockchain for decentralized AI training in healthcare, tailored for medical data; and 3) a scalable, trustless rewards mechanism adapted from DIN to incentivize participation in healthcare AI development. DHIN operates on a public blockchain with an immutable record, ensuring that no entity can control access to health data or determine financial benefits. It supports effective AI training while allowing patients to maintain control over their health data, benefit financially, and contribute to a decentralized ecosystem. Unique to DHIN, patients receive rewards in digital wallets as an incentive to opt into the FL protocol, with a long-term roadmap to fund decentralized insurance solutions. This approach introduces a novel, self-financed healthcare model that adapts to individual needs, complements existing systems, and redefines universal coverage, showcasing how DIN principles can transform healthcare data management and AI utilization while empowering patients.

Constrained Approximate Query Processing with Error and Response Time-Bound Guarantees for Efficient Big Data Analytics


Index terms: Information systems → Data management systems → Database management system engines → Database query processing; Query operators; Query optimization; Query planning.

Recommendations

Approximate query processing with error guarantees.

In recent years, with the increase of data and the sophistication of analysis requirements, query processing in databases has become more important. Recently, approximate query processing (AQP) was proposed for efficiently executing database ...

Approximate Query Processing Based on Approximate Materialized View

In the context of big data, the interactive analysis database system needs to answer aggregate queries within a reasonable response time. The proposed AQP++ framework can integrate data preprocessing and AQP. It connects existing AQP engine with ...

Approximate Query Processing: No Silver Bullet

In this paper, we reflect on the state of the art of Approximate Query Processing. Although much technical progress has been made in this area of research, we are yet to see its impact on products and services. We discuss two promising avenues to pursue ...

Published in: ACM conference proceedings, Association for Computing Machinery, New York, NY, United States. Patrizio Dazzi, Gabriele Mencagli; Program Chair: David Lowenthal; Program Co-chair: Rosa M. Badia. Sponsor: SIGARCH (ACM Special Interest Group on Computer Architecture); in cooperation with SIGHPC (ACM Special Interest Group on High Performance Computing).

Author tags: approximate query processing; exploratory data analysis; query optimization; short paper.

Funding: Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT).



A systematic literature review for load balancing and task scheduling techniques in cloud computing

Open access. Published: 05 September 2024. Volume 57, article number 276 (2024).

Authors: Nisha Devi, Sandeep Dalal, Kamna Solanki, Surjeet Dalal, Umesh Kumar Lilhore, Sarita Simaiya, and Nasratullah Nuristani.

Cloud computing is an emerging technology composed of several key components that work together to create a seamless network of interconnected devices. These interconnected devices, such as sensors, routers, smartphones, and smart appliances, are the foundation of the Internet of Everything (IoE). Huge volumes of data generated by IoE devices are processed and accumulated in the cloud, allowing for real-time analysis and insights. As a result, there is a dire need for load-balancing and task-scheduling techniques in cloud computing. The primary objective of these techniques is to divide the workload evenly across all available resources and to address other issues such as reducing execution time and response time, increasing throughput, and improving fault detection. This systematic literature review (SLR) aims to analyze various technologies comprising optimization and machine learning algorithms used for load-balancing and task-scheduling problems in a cloud computing environment. To analyze the load-balancing patterns and task-scheduling techniques, we opted for a representative set of 63 research articles written in English from 2014 to 2024 that have been selected using suitable inclusion and exclusion criteria. The SLR aims to minimize bias and increase objectivity by designing research questions about the topic. We have focused on the technologies used, the merits and demerits of the various technologies, gaps within the research, insights into tools, forthcoming opportunities, performance metrics, and an in-depth investigation into ML-based optimization techniques.


1 Introduction

The surge in IoT device usage has led to the emergence of cloud computing as a significant research focus. It offers a variety of services in many different application areas, with a high level of flexibility and scalability. The rapid growth of information and communication technologies (ICT) has resulted in the integration of big data with the IoT, revolutionizing cloud services. Within this transformative framework, cloud computing is pivotal in enabling efficient and scalable solutions for managing big data. Numerous cloud service providers enable organizations to obtain the optimal software, storage, and hardware facilities needed to accomplish their goals at a much more affordable cost. Customers subscribe to the services they require under the cloud computing paradigm and sign a service level agreement (SLA) with the cloud vendor, outlining the quality of service (QoS) and conditions of service provision. Table 1 presents the service control that the various cloud service models offer to end-users.

Load balancing is a method that distributes tasks among virtual machines (VMs) using a Virtual Machine Manager (VMM). It assists in handling different types of workloads, such as CPU, network, and memory demands (Buyya 2018; Mishra and Majhi 2020). The cloud computing infrastructure faces three significant challenges: virtualization, distributed frameworks, and load balancing. The load-balancing problem is defined as the allocation of workloads among the processing modules. In a multi-node environment, it is quite probable that certain nodes will experience excessive workload while others remain inactive. Load imbalance is a harmful event for cloud service providers (CSPs), as it diminishes the dependability and effectiveness of computing services while also putting at risk the quality of service (QoS) guaranteed in the service level agreement (SLA) between the customer and the cloud service provider (Oduwole et al. 2022).

Verma et al. (2024) introduced a load-balancing methodology, utilizing genetic algorithms (GA), to improve the quality of the telemedicine industry by efficiently adapting to changing workloads and network conditions at the fog level. This adaptability can enhance patient care and provide scalability for future healthcare systems. Walia et al. (2023) cover several emerging technologies in their survey, including Software-Defined Networking (SDN), Blockchain, Digital Twins, Industrial IoT (IIoT), 5G, serverless computing, and quantum computing. These technologies can be incorporated into current fog/edge-of-things models for improved analysis and can provide business intelligence for IoT platforms. Adaptive resource management strategies are necessary for efficient scheduling and decision-offloading due to the infrastructural efficiency of these computing paradigms.

1.1 Need for load balancing, influencing factors, and associated challenges

Intelligent Computing Resource Management (ICRM) is rapidly evolving to meet the increasing needs of businesses and sectors, driven by the proliferation of Internet-based technologies, cloud computing, and cyber-physical systems. With the rise of information-intensive applications, artificial intelligence, cloud computing, and IoT, intelligent computing monitoring and resource allocation have become crucial (Biswas et al. 2024). Cloud data centers typically require optimization because they are built to handle hundreds of workloads, which can otherwise result in low resource utilization and energy waste. The goals of load balancing include reduced job execution times, optimal resource utilization, and high system throughput. Load balancing reduces the overall resource waiting time and avoids resource overload (Apat et al. 2023). In terms of equilibrium load distribution, load balancing between virtual machines (VMs) is an NP-hard problem; its difficulty stems from two elements: the huge solution space and the need for polynomial-bounded computation. In a cloud computing environment, the load can be characterized as under-loaded, overloaded, or balanced. Identifying overloaded and under-loaded nodes and then redistributing the load across them is critical to load balancing (Santhanakrishnan and Valarmathi 2022). With the emergence of these technologies, many challenges have also arisen, including storage capacity, high processing speed, low latency, fast transmission, load balancing, efficient routing, and cost efficiency. Load balancing is a crucial optimization procedure in cloud computing, and achieving this objective depends on dynamic resource allocation. Some factors that affect load balancing in cloud computing are as follows:

Workload patterns: Varying workloads, unpredictable traffic patterns, and heterogeneous applications may affect the efficiency of the cloud system.

Geographical distribution: Cloud data centres are generally located in remote areas, which contributes to transmission delays. Fog and edge computing are therefore used to reduce these delays, but the limited resources of fog and edge devices must be managed efficiently.

Cost and budget constraints: Cost considerations have a significant impact on load-balancing strategies, which frequently aim to use less expensive resources or minimize idle assets.

Application dynamics and monitoring: The dynamic nature of applications necessitates elasticity and scalability of cloud services. In addition, inadequate monitoring makes it challenging to balance the load.

SLA agreements and breaches: SLA violations depend on the services offered by cloud service providers. Quality must be maintained without compromising other factors such as throughput, makespan, energy consumption, and cost.

Virtual machine (VM) migrations: An increase in the number of VM migrations leads to a decrease in service quality. While VM migration can be beneficial to some extent, frequent migrations increase time complexity, since transferring data from one VM to another, including copying memory pages to the host machine, takes considerable time.

Resource availability: Insufficient resources, such as CPU, memory, or bandwidth, limit the load balancing efficiency.

Energy consumption: Energy consumption is a critical factor in data centers. Load balancing is necessary to reduce energy consumption by migrating VMs from overloaded resources to under-loaded hosts.

Other factors, such as fault tolerance, predictive analytics, network latency, and data security, also affect load balancing in a cloud system. We have divided the technologies reviewed through this SLR into five categories: conventional/traditional, heuristic, meta-heuristic, ML-centric, and hybrid. Traditional approaches to cloud computing resource allocation and load balancing are time-consuming, unable to yield fast results, and frequently trapped in local optima (Mousavi et al. 2018). In dynamic cloud systems, where resource requirements are estimated at runtime, static load balancing algorithms may not be successful. Dynamic load balancing algorithms, such as ESCE and the Throttled mechanism, analyse resource requirements and usage during runtime, yet they may incur extra cost and overhead. Traditional algorithms often struggle to scale with the size and complexity of problems. Several articles explore traditional task scheduling algorithms, including Min-min, First Come First Serve (FCFS), and Shortest Job First (SJF). These algorithms are not used often due to their slow processing and time-consuming behaviour. To overcome the issues of conventional methods, heuristic approaches entered the research area. Kumar and Sharma (2018) propose a resource provisioning and de-provisioning algorithm that outperforms FCFS, SJF, and Min-min in terms of makespan time and task acceptance ratio; however, task priority is poorly considered, highlighting a limitation in its task allocation strategy. Heuristic algorithms demonstrate remarkable scalability. They are highly suitable for handling large-scale optimisation challenges in various industries, including manufacturing, banking, and logistics, due to their efficiency in locating approximate solutions, even in enormous search spaces (Mishra and Majhi 2020). Kumar et al. (2018) presented another heuristic method, named 'Dynamic Load Balancing Algorithm with Elasticity', showcasing reduced makespan time and an increased task completion ratio. Dubey et al. (2018) introduced a Modified Heterogeneous Earliest Finish Time (HEFT) algorithm, demonstrating improved server workload distribution to reduce makespan time. While promising, both studies lack comprehensive performance evaluations and only partially address other Quality of Service (QoS) metrics, such as response time and cost efficiency. Hung et al. (2019) proposed an Improved Max–min algorithm, achieving the lowest completion and optimal response times; it outperformed the conventional RR, Max–min, and Min-min algorithms.

The development of meta-heuristic algorithms aimed to address the shortcomings of heuristic algorithms, which typically produce approximate rather than optimal solutions. Hybrid techniques have gained traction in recent years, combining heuristic, traditional, and machine-learning approaches. Mousavi et al. (2018) propose a hybrid technique combining Teaching Learning-Based Optimization (TLBO) and Grey Wolf Optimization (GWO), achieving maximized throughput without falling into local optima. Similarly, Behera and Sobhanayak (2024) propose a hybrid GWO-GA algorithm, outperforming GWO, GA (Rekha and Dakshayini 2019), and PSO in terms of makespan, cost, and energy consumption. Further, we also discuss the cloud and fog architecture and its working principles in the upcoming sections.

1.2 Motivation for the study

The Industrial Internet of Things (IIoT) has experienced significant advancement and adoption due to rapid progress in artificial intelligence techniques. In Industry 5.0, the hyper-automation process involves the deployment of intelligent devices connected to the IIoT, cloud computing, smart robots, agile software, and embedded components. These systems can leverage the Industry 5.0 concept, which generates massive amounts of data for hyper-automated communication across cloud computing, digital transformation, human sectors, intelligent robots, and industrial production. Big data management requires cloud and fog technology (Souri et al. 2024). Similarly, telemedicine, facilitated by fog computing, has revolutionized the healthcare industry by providing remote access to medical treatments. However, ensuring minimal latency and effective resource utilization is essential for providing high-quality healthcare (Verma et al. 2024). Big data in the industrial sector is crucial for predictive maintenance, enabling informed decisions and enhancing task allocation in Industry 4.0, thus necessitating a proficient resource management system (Teoh et al. 2023). The growing demand for load balancing across industries that use cloud/fog services motivated us to evaluate the escalating need for resource management technologies. The core contribution of this review is to provide insights into innovative algorithms, their strengths and weaknesses, dataset details, simulation tools, research gaps, and future research directions.

1.3 Objectives of the SLR

Based on a detailed review of the selected studies, this SLR pursues the following objectives:

To systematically identify and categorise the different load balancing and task scheduling algorithms used in cloud computing.

To address fundamental research questions, such as the effectiveness of different algorithmic approaches, simulation tools, metrics evaluation, etc.

To analyse trends and patterns in the literature, such as the prevalence of Meta-heuristic, Hybrid, and ML-centric approaches, and identify any shifts or emerging paradigms in algorithm design.

To conduct a comparative analysis of the different algorithm categories, identifying strengths, weaknesses, research limitations and trade-offs between them.

To lay the groundwork for future technological advancements by identifying areas where further research and development are needed.

1.4 Research contributions of the SLR

Through this SLR, we contribute the following insights, based on the selected studies:

We have examined selected articles to identify the research patterns and technological advancements related to resource load balancing in cloud computing. We have devised research questions and attempted to ascertain their solutions.

Using this SLR, we presented a taxonomy of algorithms that provide solutions to the chosen problem.

We provided an in-depth examination of the limitations and advantages of different strategies, along with a thorough comparison study of the techniques discussed in Table  5 , Table  7 , and Table  8 .

We have discussed the performance metrics related to load balancing and task scheduling in the cloud system. We have also explored the simulation tools that the authors in this field prefer.

We have tabulated some benchmarked datasets (Table  6 ) utilized by various authors to achieve several performance metrics.

Finally, we compiled the research gaps and potential areas for future research.

The paper is structured in nine sections, as shown in Fig. 1.

Fig. 1 Various sections and subsections of the SLR

2 Methodology of the systematic literature review

This section lays out the components of a systematic literature review, including the search criteria, review methodology, and research questions. This process involves defining research questions or objectives, identifying relevant databases and sources, and systematically searching and screening for eligible studies. The search term constitutes a string encompassing all essential keywords in the research questions and their corresponding synonyms.

2.1 Search criteria and quality assessment

The keywords utilized to form the search strings are "load balancing", "task scheduling", "cloud computing", and "machine learning". To extract relevant papers, the following advanced search query was used in the Scopus database:

Fig. a Advanced search query string used in the Scopus database
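The exact query string used by the authors is the one shown in the figure above and is not reproduced in this text. Purely as an illustration, a Scopus-style advanced search assembled from the four stated keywords and the 2014–2024 window might look like the following; the field codes and year bounds are our assumptions, not the authors' exact query:

```
TITLE-ABS-KEY("load balancing" OR "task scheduling")
  AND TITLE-ABS-KEY("cloud computing")
  AND TITLE-ABS-KEY("machine learning" OR "optimization")
  AND PUBYEAR > 2013 AND PUBYEAR < 2025
  AND LANGUAGE(english)
```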

Various computer science publication libraries were searched manually. The SLR search covered the Scopus database, IEEE Computer Society, ResearchGate, ScienceDirect, Springer, and the ACM Digital Library.

A total of 550 papers were initially retrieved using the advanced query above. We then applied the inclusion–exclusion criteria provided in Table 2; approximately 122 papers were excluded because they had zero citations or required purchase to access. We incorporated cross-referenced studies to obtain a more comprehensive and higher-quality analysis, manually choosing 35 cross-references from the extracted set that strictly adhered to the search criteria, so as to encompass a broader range of reliable studies. A final selection of 96 papers was made, of which 63 research articles were exclusively considered for the technological survey.

2.2 Inclusion–exclusion criteria

The criteria for accepting or rejecting a research paper for the study are explained in Table 2 below.

Data extraction has been performed to capture key information from each study, such as design, methods or techniques, research limitations, future scope, tools, evaluation metrics, and other significant findings. This captured information was then synthesized and analyzed through a systematic and structured approach and placed in a tabular format to provide insights and draw conclusions about the research questions.

2.3 Research questions

This study aims to answer the following research questions by investigating, comprehending, and evaluating the methods, models, and algorithms utilized to achieve task scheduling and load balancing.

Q1. What are the current load balancing and task scheduling techniques commonly used in cloud computing environments?

Q2. What are the key factors influencing the performance of load-balancing mechanisms in cloud computing?

Q3. Which evaluation metrics are predominantly utilized for assessing the efficacy of load-balancing techniques in cloud computing environments?

Q4. Which categories of algorithms are used more in the recent research trend in the cloud computing environment for solving load balancing issues?

Q5. Which simulation software tools have garnered prominence in recent scholarly analyses within the domain of cloud computing research?

Q6. What insights do the future perspectives within the reviewed literature offer in terms of potential avenues for exploration and advancement within the field?

The next section explores the working principles and architecture of cloud computing, including the fog and IoT application layers.

3 Cloud-fog architecture and relevant frameworks

Cloud-fog architecture extends a centralized cloud infrastructure and broadens the scope of cloud computing functionalities towards the network's edge. It leverages fog computing, an intermediate layer between cloud servers and end devices, to enable real-time processing, data storage, and analytics closer to the data source. Fog nodes, deployed at the network edge, act as mediators between end devices and the cloud, thereby reducing latency and bandwidth consumption. These nodes can be physical or virtual entities, such as routers, switches, gateways, or even edge servers.

3.1 Working principles

The working principles of cloud-fog architecture involve collaboration between cloud servers, fog nodes, and end devices, creating a distributed computing environment. An end device initiates a request, which first passes through the nearest fog node. The fog node performs initial processing, filtering, and aggregation of the data before sending a subset of it to the cloud for further analysis or storage. By offloading some processing tasks to the fog nodes, cloud-fog architecture reduces the burden on the cloud, improves response times, and enhances overall system performance. During task execution, dynamic cloud load balancing techniques assign tasks to virtual machines and adjust the load on these machines based on the system's conditions (Tawfeeg et al. 2022). Alatoun et al. (2022) presented an EEIoMT framework for executing critical tasks in the shortest time in smart medical services while balancing energy consumption with other tasks; the authors utilized ECG sensors for health monitoring at home. Similarly, Swarna Priya et al. (2020) proposed an energy-efficient framework known as the 'EECloudIoE framework' for retrieving information from the IoE cloud network. The authors adopted the 'Wind Driven Optimization' algorithm to form clusters of sensor nodes in the IoE network; the Firefly algorithm is then utilized to select the 'cluster head' (CH) for each cluster. Sensor nodes in sensor networks are also used to track physical events across widely dispersed geographic locations. These nodes assist in gathering crucial data from these sites over extended periods; however, they suffer from low battery power. Therefore, it is essential to implement energy-efficient systems using wireless sensor networks to collect this data. Still, cloud computing has some limitations, such as the geographical locations of cloud data centers, network connectivity with end nodes, weather conditions, etc. To overcome these issues, fog computing emerged as a solution. Fog computing acts as an arbitrator between end devices and cloud computing, providing storage, networking, and computation services closer to edge devices. The introduction of edge computing has brought about the emergence of various computing paradigms, such as Mobile Edge Computing (MEC) and Mobile Cloud Computing (MCC). MEC primarily emphasizes a 2- or 3-tier application in the network and mobile devices equipped with contemporary cellular base stations. It improves the efficiency of networks by optimizing content distribution and facilitating the creation of applications (Sabireen and Neelanarayanan 2021). Figure 2 shows how the cloud, fog, and IoT layers work in collaboration.

Fig. 2 The fog extends the cloud closer to the devices producing data (Swarna Priya et al. 2020; Vergara et al. 2023)
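As a rough illustration of the request flow described in Sect. 3.1 (not code from any of the cited frameworks), the sketch below shows a fog node filtering and aggregating raw device readings and forwarding only a compact summary to the cloud; the threshold, data shapes, and function names are invented for the example:

```python
# Minimal sketch of the cloud-fog request flow: a fog node filters and
# aggregates readings locally and forwards only a reduced subset to the cloud.
from statistics import mean

def fog_node_process(readings, threshold=50.0):
    """Filter out uninteresting readings and aggregate the rest."""
    filtered = [r for r in readings if r >= threshold]     # local filtering
    if not filtered:
        return None                                        # nothing worth forwarding
    return {"count": len(filtered), "avg": mean(filtered), "max": max(filtered)}

def cloud_store(summary, archive):
    """Stand-in for cloud-side storage/analytics."""
    archive.append(summary)

# An end device emits raw sensor readings; only the summary reaches the cloud.
archive = []
raw_readings = [12.0, 55.5, 61.2, 48.9, 73.4]
summary = fog_node_process(raw_readings)
if summary is not None:
    cloud_store(summary, archive)
print(archive)   # e.g. [{'count': 3, 'avg': 63.37, 'max': 73.4}]
```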

3.2 Cloud computing layer

Cloud computing facilitates virtualization technology, which combines distributed and parallel processing. Using centralized data centers, it transfers computations from on-premises infrastructure to off-premises facilities. It has become an advanced technology within the swiftly expanding realm of computing paradigms owing to two principles: (1) 'dynamic provisioning' and (2) 'virtualization technology' (Tripathy et al. 2023). Dynamic provisioning is a fundamental concept in cloud computing: it refers to the automated process of allocating and adjusting computing resources to meet the changing needs of cloud-based applications and services. Virtual network embedding is essential to load balancing in cloud computing, as it ensures that virtual network requests are mapped onto physical resources in an effective and balanced manner. By effectively embedding virtual networks onto physical machines, load-balancing algorithms can divide network traffic and workload evenly across the network infrastructure, preventing any single resource from becoming overloaded. Virtual network embedding may be combined with load-balancing strategies such as least connections, weighted round-robin, and round-robin to maximize resource usage and network performance (Apat et al. 2023; Santhanakrishnan and Valarmathi 2022).

3.3 Fog computing layer

Cisco researchers first used the term fog computing in 2012 to address the shortcomings of cloud computing. To offer fast and reliable services to mobile consumers, fog computing enhances their experiences by introducing a middle fog layer between consumers and the cloud. It is an improvement over cloud-based networking and computing services. The architecture of fog computing consists of a fog server as a fog device or fog node deployed in the proximity of IoT devices to provide resources for different applications. As a promising concept, fog computing introduces a decentralized architecture that enhances data processing capabilities at the network’s edge (Goel and Tiwari 2023 ). However, the limited resources in the fog computing model undoubtedly make it difficult to support several services for these Internet of Things applications. A prompt choice must be made regarding load balancing and application placement in the fog layer due to the diverse and ever-changing nature of application requests from IoT devices. Therefore, it is crucial to allocate resources optimally to maintain service continuity for end customers (Vergara et al. 2023 ). Unlike cloud computing, fog utilizes distributed computing with devices near clients with good computing capacity and diverse organizations for global connectivity. Mahmoud et al. ( 2018 ) introduced a new fog-enabled cloud IoT model by observing that cloud IoT is not the best option in situations where energy usage and latency are important considerations, such as the healthcare sector, where patients need to be monitored in real-time without delay. The energy allocation method used to load jobs into a fog device serves as the foundation for the entire concept. Table 3 presents a comparison between the features of cloud and fog computing paradigms.

3.4 IoT applications layer

Cloud-fog architecture finds applications in various domains, including IoT, healthcare (Alatoun et al. 2022 ), transportation, smart cities, and industrial automation (Dogo et al. 2019 ). Healthcare providers can leverage fog nodes for real-time patient monitoring, while industrial automation systems can benefit from edge analytics for predictive maintenance. Telemedicine, smart agriculture and industry 4.0 and 5.0 are other areas that employ IoT applications. Edge computing and cloud computing have given rise to additional computing paradigms such as mobile edge computing (MEC) and mobile cloud computing (MCC). The MEC primarily emphasizes a network architecture that includes a 2- or 3-tier application, and mobile devices equipped with modern wireless base stations. It improves network efficiency, as well as the dissemination of application content (Sabireen and Neelanarayanan 2021 ).

4 Literature review on load balancing (LB) and task scheduling

We have curated a representative collection of 63 research articles for a technology review. The literature review covers the period from 2014 to 2024. The main target of LB is to spread the workload across available assets and optimize the overall turnaround time. Before 2014, traditional methods such as FCFS, SJF, Min-min, Max–min, RR, etc., were recognized for their poor processing speeds and time-consuming job scheduling and load balancing. Konjaang et al. (2018) examine the difficulties associated with the conventional Max–Min algorithm and propose the Expa-Max–Min method as a possible solution. The algorithm prioritizes cloudlets with the longest and shortest execution times to schedule them efficiently. The workload can be divided into memory capacity issues, CPU load, and network load. Meanwhile, load balancing techniques, together with virtual machine management (VMM), are employed in cloud computing to distribute the load among virtual machines (Velpula et al. 2022). In 2019, Hung et al. (2019) introduced an enhanced max–min algorithm called MMSIA. The objective of the MMSIA algorithm is to improve completion time in cloud computing by utilizing machine learning to cluster requests and optimize the utilization of virtual machines. The system allocates big requests to virtual machines (VMs) with the lowest utilization percentage, improving processing efficiency. The approach integrates supervised learning into the Max–Min scheduling algorithm to enhance clustering efficiency. Kumar et al. (2018) state that the updated HEFT algorithm creates a Directed Acyclic Graph (DAG) for all jobs submitted to the cloud and assigns computation costs and communication edges across processing resources.

The ordering of tasks is determined by their execution priority, which considers the average time it takes to complete each task on all processors and the costs associated with communication between predecessor tasks. The tasks are then organized in a list in decreasing order of priority and assigned to processors based on the shortest execution time. Similarly, Seth and Singh (2019) propose the Dynamic Heterogeneous Shortest Job First (DHSJF) model as a solution for work scheduling in cloud computing systems with varying capabilities. The algorithm entails the establishment of a heterogeneous cloud computing environment, the dynamic generation of cloudlet lists, and the analysis of workload and resource heterogeneity to minimize the makespan. The DHSJF algorithm efficiently schedules dynamic requests to various resources, resulting in optimized resource utilization. This method overcomes the limitations of the conventional Shortest Job First (SJF) method. A task scheduling process is shown graphically in Fig. 3.

Fig. 3 Working of task scheduling in cloud computing
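To make the heuristic scheduling idea concrete, the sketch below implements the generic Max–Min heuristic discussed earlier in this section (not the Expa-Max–Min or MMSIA variants from the cited papers): each step computes every remaining task's minimum completion time across VMs and assigns the task whose minimum is largest. The execution-time matrix is illustrative.

```python
# Generic Max-Min scheduling sketch. ect[i][j] is the expected execution time
# of task i on VM j; each step assigns the "largest" remaining task to the VM
# that would finish it earliest.
def max_min_schedule(ect):
    n_tasks, n_vms = len(ect), len(ect[0])
    ready = [0.0] * n_vms                 # time at which each VM becomes free
    unscheduled = set(range(n_tasks))
    assignment = {}
    while unscheduled:
        best_task, best_vm, best_ct = None, None, -1.0
        for i in unscheduled:
            # minimum completion time of task i over all VMs
            ct, vm = min((ready[j] + ect[i][j], j) for j in range(n_vms))
            if ct > best_ct:              # Max-Min: keep the largest of the minima
                best_task, best_vm, best_ct = i, vm, ct
        assignment[best_task] = best_vm
        ready[best_vm] = best_ct
        unscheduled.remove(best_task)
    return assignment, max(ready)         # task -> VM map and resulting makespan

ect = [[14, 16], [5, 9], [20, 11], [7, 6]]   # 4 tasks, 2 VMs (toy data)
print(max_min_schedule(ect))
```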

Another technique that many authors increasingly employ is GWO. The GWO technique maps the social hierarchy of grey wolves onto candidate solutions for distributing jobs or equalizing workloads within a network or computing system. The alpha wolf leads the pack, representing the best solution found so far. The alpha is assisted in decision-making and problem-solving by the beta and delta wolves, which represent the second- and third-best solutions, respectively. The omega wolves, which stand for the remaining solutions, are guided by the top three wolves. The algorithm models the exploration and exploitation stages in pursuing the optimal solution through a repetitive process of encircling, hunting, and attacking the prey. In 2020, Farrag et al. (2020) published a work that examines the application of the Ant-Lion Optimizer (ALO) and Grey Wolf Optimizer (GWO) in job scheduling for cloud computing. The objective of ALO and GWO is to optimize the makespan of tasks in cloud systems by effectively dividing the workload. Although ALO and GWO surpass the Firefly Algorithm (FFA) in minimizing makespan, their performance relative to PSO varies depending on the specific conditions. Reddy et al. (2022) introduced the AVS-PGWO-RDA scheme, which utilizes Probabilistic Grey Wolf Optimization (PGWO) in the load balancer unit to find the ideal fitness value for selecting user tasks and allocating resources for tasks with lower complexity and time consumption. The AVS approach is employed to cluster related workloads, and the RDA-based scheduler ultimately assigns these clusters to suitable virtual machines (VMs) in the cloud environment. Similarly, Janakiraman and Priya (2023) introduced the Hybrid Grey Wolf and Improved Particle Swarm Optimization Algorithm with Adaptive Inertial Weight-based Multi-dimensional Learning Strategy (HGWIPSOA). This algorithm combines the Grey Wolf Optimization Algorithm (GWOA) with Particle Swarm Optimization (PSO) to efficiently assign tasks to VMs and improve the accuracy and speed of task scheduling and resource allocation in cloud environments. The suggested system effectively tackles the limitations of previous LB approaches by preventing premature convergence and enhancing global search capability. As a result, it provides several benefits, including improved throughput, reduced makespan, a reduced degree of imbalance, decreased latency, and reduced execution time. The combination of GWO with GA, as demonstrated by Behera and Sobhanayak (2024), yields superior results, providing faster convergence and minimum makespan in large task scheduling scenarios.
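The encircling/hunting behaviour described above corresponds, in the standard textbook formulation of GWO, to a position update that averages three moves guided by the alpha, beta, and delta wolves. The following minimal sketch uses a continuous search space and a toy objective; it is not the PGWO or HGWIPSOA variants discussed in the cited works:

```python
# Standard Grey Wolf Optimizer position update: alpha, beta and delta guide the
# rest of the pack while the control parameter a decays linearly from 2 to 0.
import random

def gwo_step(wolves, fitness, a):
    scored = sorted(wolves, key=fitness)
    alpha, beta, delta = scored[0], scored[1], scored[2]
    new_pack = []
    for x in wolves:
        pos = []
        for d, xd in enumerate(x):
            guided = []
            for leader in (alpha, beta, delta):
                r1, r2 = random.random(), random.random()
                A, C = 2 * a * r1 - a, 2 * r2
                D = abs(C * leader[d] - xd)          # distance to the leader
                guided.append(leader[d] - A * D)     # candidate position
            pos.append(sum(guided) / 3.0)            # average of the three guides
        new_pack.append(pos)
    return new_pack

# Toy usage: minimise the sphere function in 2-D.
sphere = lambda x: sum(v * v for v in x)
pack = [[random.uniform(-10, 10) for _ in range(2)] for _ in range(8)]
for it in range(50):
    a = 2 - 2 * it / 50                              # linear decay of a
    pack = gwo_step(pack, sphere, a)
print(min(pack, key=sphere))
```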

At the beginning of 2014, metaheuristic and hybrid-metaheuristic algorithms began to be used to address cloud computing optimization and load-balancing challenges. Zhan et al. (2014) suggested a load-aware genetic algorithm called LAGA, which is a modified version of the genetic algorithm (GA). LAGA employs the TLB model to optimize makespan and load balance, establishing a new fitness function to find suitable schedules that maintain makespan while preserving load balance. Rekha and Dakshayini (2019) introduced a task allocation method for cloud environments that utilizes a genetic algorithm. The purpose of this strategy is to minimize job completion time and enhance overall performance. The algorithm considers multiple objectives, such as energy consumption and quick responses, to make the best decisions regarding resource allocation. The evaluation findings exhibit superior throughput using the proposed approach, indicating its efficacy in task allocation decision-making. In 2023, Mishra and Majhi (2023) proposed a hybrid meta-heuristic technique called GAYA, which combines the Genetic Algorithm (GA) and the JAYA algorithm. The purpose of this technique is to efficiently schedule dynamically independent biological data. The GAYA algorithm showcases improved exploitation and exploration abilities, rendering it a highly viable solution for scheduling dynamic medical data in cloud-based systems. Brahmam and Vijay Anand (2024) developed a model called VMMISD, in which they combined a Genetic Algorithm (GA) with Ant Colony Optimization (ACO) for resource allocation. The system also utilizes combined optimization techniques, iterative security protocols, and deep learning algorithms to enhance the efficiency of load balancing during virtual machine migrations. The model employs K-means clustering, fuzzy logic, Long Short-Term Memory (LSTM) networks, and graph networks to anticipate workloads, make decisions, and measure the affinity between virtual machines (VMs) and physical machines. Behera and Sobhanayak (2024) also proposed a hybrid approach that combines the Grey Wolf Optimizer (GWO) and Genetic Algorithm (GA). The hybrid GWO-GA algorithm effectively reduces makespan, energy consumption, and computing costs, surpassing conventional algorithms in performance. It exhibits accelerated convergence in extensive scheduling problems, offering an edge over earlier techniques.

The combination of autoscaling and reinforcement learning (RL) has garnered significant attention in recent years due to its ability to allocate resources proactively (Joshi et al. 2024). Deep reinforcement learning (DRL) is a promising technique that automates workload prediction. DRL can make immediate resource allocation decisions based on real-time monitoring of the system's workload and performance parameters, effectively fulfilling the system's present demands. Ran et al. (2019) introduced a task-scheduling strategy based on deep reinforcement learning (DRL) in 2019. The working of the DRL-based load balancer is shown in Fig. 4. This method assigns tasks to various virtual machines (VMs) dynamically, decreasing average response time and ensuring load balancing. The technique is examined on a tower server with specific configurations and software tools, showcasing its efficacy in balancing load across virtual machines (VMs) while adhering to service level agreement (SLA) limits. The approach employs deep reinforcement learning (DRL) and deep deterministic policy gradient (DDPG) networks to make optimal scheduling decisions by learning directly from experience without prior knowledge. In addition, Jyoti and Shrimali (2020) employed DRL in their research and proposed a technique called Multi-agent 'Deep Reinforcement Learning-Dynamic Resource Allocation' (MADRL-DRA) in the Local User Agent (LUA) and Dynamic Optimal Load-Aware Service Broker (DOLASB) in the Global User Agent (GUA) to improve quality of service (QoS) metrics by allocating resources dynamically. The method demonstrates enhanced performance in terms of execution time, waiting time, energy efficiency, throughput, resource utilization, and makespan when compared to traditional approaches. Tong et al. (2021) present a new technique for task scheduling using deep reinforcement learning (DRL) that aims to reduce virtual machine (VM) load imbalance and the job rejection rate while also considering service-level agreement constraints. The proposed DDMTS method exhibits stability and outperforms other algorithms in effectively balancing the Degree of Imbalance (DI) and minimizing the job rejection rate. The precise configurations of state, action, and reward in the DDMTS algorithm are essential for its efficacy in resolving task scheduling difficulties using the DQN algorithm.

Fig. 4 Working of load balancer in cloud computing

Double Deep Q-learning has been employed to address load-balancing concerns. Swarup et al. (2021) introduced a method utilizing Deep Reinforcement Learning (DRL) to address job scheduling in cloud computing. Their approach employs a Clipped Double Deep Q-learning algorithm to minimize computational costs while adhering to resource and deadline constraints. The algorithm uses target network and experience replay techniques to optimize its objective function, and it balances exploration and exploitation using the ε-greedy policy. This policy establishes the approach for selecting actions by considering the trade-off between exploration and exploitation: the system chooses actions randomly for exploration or based on Q-values for exploitation, thus maintaining a balance between attempting new alternatives and utilizing existing ones. Similarly, Kruekaew et al. employ Q-learning to optimize job scheduling and resource utilization. The suggested method, Multi-Objective ABCQ (MOABCQ), integrates the Artificial Bee Colony (ABC) algorithm with Q-learning to optimize task scheduling, resource utilization, and load balancing in cloud environments. MOABCQ exhibited superior throughput and a higher Average Resource Utilization Ratio (ARUR) than alternative algorithms, with Q-learning enhancing the efficiency of the ABC algorithm. Figure 5 presents the hybridisation trend of various techniques observed in the literature review.

Fig. 5 Hybridization trend of some techniques as observed in SLR
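A minimal tabular sketch of the two mechanisms just described, ε-greedy action selection and a clipped double Q-learning target, is given below; the state encoding, reward signal, and per-VM response times are illustrative assumptions rather than details from Swarup et al. (2021):

```python
# Tabular sketch of epsilon-greedy selection and a clipped double Q-learning
# target (bootstrapping with the smaller of two estimators).
import random
from collections import defaultdict

N_VMS = 3
Q1 = defaultdict(lambda: [0.0] * N_VMS)   # first Q-table:  state -> action values
Q2 = defaultdict(lambda: [0.0] * N_VMS)   # second Q-table (reduces overestimation)

def select_vm(state, epsilon=0.1):
    """Epsilon-greedy: explore randomly, otherwise exploit the summed estimates."""
    if random.random() < epsilon:
        return random.randrange(N_VMS)
    totals = [q1 + q2 for q1, q2 in zip(Q1[state], Q2[state])]
    return max(range(N_VMS), key=totals.__getitem__)

def update(state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Clipped double Q target: both estimators regress toward the same target,
    which uses the minimum of the two next-state values."""
    a_star = max(range(N_VMS), key=Q1[next_state].__getitem__)
    target = reward + gamma * min(Q1[next_state][a_star], Q2[next_state][a_star])
    for Q in (Q1, Q2):
        Q[state][action] += alpha * (target - Q[state][action])

# Toy episode: reward is the negative (illustrative) response time of the chosen VM.
response_time = [0.8, 0.3, 1.2]
state = "load_profile_0"
for _ in range(200):
    vm = select_vm(state)
    update(state, vm, -response_time[vm], state)
print(Q1[state])   # VM 1 (lowest response time) should accumulate the best value
```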

Furthermore, the swarm-based technique known as Particle Swarm Optimisation (PSO) is increasingly being adopted by researchers to address challenges related to load balancing in cloud computing. Using PSO, combined with other prominent methods, leads to an ideal solution through extensive investigation and exploration of the search space. Panwar et al. (2019) introduced a TOPSIS-PSO method designed for non-preemptive task scheduling in cloud systems. The approach tackles task scheduling challenges by employing the TOPSIS method to evaluate tasks according to execution time, transmission time, and cost; optimisation is then performed using PSO. The proposed method optimises the makespan, execution time, transmission time, and cost metrics. In 2020, Agarwal et al. (2020) introduced a mutation-based Particle Swarm Optimization (PSO) algorithm to tackle issues such as premature convergence, decreased convergence speed, and being trapped in local optima. The suggested method seeks to minimise performance characteristics such as makespan time and to enhance the fitness function in cloud computing. In 2021, Negi et al. (2021) introduced a hybrid load-balancing algorithm in cloud computing called CMODLB. This technique combines machine learning and soft computing techniques. The method employs artificial neural networks, fuzzy logic, and clustering techniques to distribute the workload evenly. The system utilises Bayesian optimization-based augmented K-means for virtual machine clustering and the TOPSIS-PSO method for work scheduling. VM migration decisions are determined with an interval type-2 fuzzy logic system that relies on load conditions. Although these algorithms demonstrated strong performance, they do not consider the specific type of content used by users. Adil et al. (2022) found that knowledge about the type of content in tasks can significantly enhance scheduling efficiency and reduce the workload on virtual machines (VMs). The PSO-CALBA system categorises user tasks into several content types, such as video, audio, image, and text, using a Support Vector Machine (SVM) classifier. The categorisation begins by selecting file fragments, which are tasks that consist of diverse file fragments of different content types. The initial classification stage involves utilising the Radial Basis Function (RBF) kernel approach to analyse high-dimensional data, which is challenging. Pradhan et al. (2022) provided a solution to the issue of handling complicated, high-dimensional data in a cloud setting. To address this challenge, they utilised deep reinforcement learning (DRL) and parallel particle swarm optimisation (PSO). The proposed technique synergistically integrates Particle Swarm Optimisation (PSO) and Deep Reinforcement Learning (DRL) to optimise rewards by minimising both makespan time and energy consumption while ensuring high accuracy and fast execution. The algorithm iteratively enhances accuracy, demonstrating superior performance in dynamic environments, and can handle various tasks in cloud environments. Jena et al. (2022) found that the QMPSO algorithm successfully distributes the workload evenly among virtual machines, resulting in improved makespan, throughput, and energy utilisation, and reduced task waiting time. The performance of the hybridisation of modified Particle Swarm Optimisation (MPSO) and improved Q-learning in QMPSO is enhanced by modifying the velocity based on the best action generated through Q-learning.
The technique employs dynamic resource allocation to distribute tasks among virtual machines (VMs) with varying priorities. This approach aims to minimise task waiting time and maximise VM throughput. This strategy is highly efficient for independent tasks.

Load balancing poses a significant challenge in Fog computing due to limited resources. Talaat et al. ( 2022 ) introduced a method called Effective Dynamic Load Balancing (EDLB) that utilises Convolutional Neural Networks (CNN) and Multi-Objective Particle Swarm Optimisation (MPSO) to optimise resource allocation in fog computing environments to maximise resource utilisation. The EDLB system comprises three primary modules: the Fog Resource Monitor (FRM), the CNN-based Classifier (CBC), and the Optimised Dynamic Scheduler (ODS). The FRM system monitors the utilisation of server resources, while the CBC system classifies fog servers. Additionally, the ODS system allocates incoming tasks to the most appropriate server, reducing response time and enhancing resource utilisation. This strategy effectively decreases response time. Comparably, Nabi et al. ( 2022 ) presented an Adaptive Particle Swarm Optimisation (PSO)-Based Task Scheduling Approach for Cloud Computing, explicitly emphasising achieving load balance and optimisation. The solution incorporates a technique called Linearly Descending and Adaptive Inertia Weight (LDAIW) to improve the efficiency of job scheduling. The methodology employs a population-based scheduling system that draws inspiration from swarm intelligence. In this technique, particles represent solutions, and their updates are determined by factors such as inertia weight, personal best, and global best. The method can reduce task execution time, increase throughput, and better balance local and global search.
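The canonical PSO update behind these approaches, with a linearly descending inertia weight in the spirit of LDAIW, can be sketched as follows; the coefficients, bounds, and toy objective are illustrative, not the parameters used in the cited studies:

```python
# Canonical PSO velocity/position update with a linearly decreasing inertia weight.
import random

def pso_minimise(f, dim=2, n_particles=10, iters=100, w_max=0.9, w_min=0.4,
                 c1=2.0, c2=2.0, lo=-10.0, hi=10.0):
    x = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in x]
    gbest = min(pbest, key=f)[:]
    for t in range(iters):
        w = w_max - (w_max - w_min) * t / (iters - 1)   # linearly descending inertia
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])   # cognitive pull
                           + c2 * r2 * (gbest[d] - x[i][d]))     # social pull
                x[i][d] += v[i][d]
            if f(x[i]) < f(pbest[i]):
                pbest[i] = x[i][:]
                if f(pbest[i]) < f(gbest):
                    gbest = pbest[i][:]
    return gbest

print(pso_minimise(lambda p: sum(u * u for u in p)))   # should approach [0, 0]
```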

Table 4 gives an overview of the advantages and disadvantages of the state-of-the-art techniques. A comparative analysis of state-of-the-art methods on publicly benchmarked datasets is presented in Table  5 .

4.1 Some essential load balancing metrics

It is evident that meticulous monitoring and analysis of metrics enhance resource utilization, minimize downtime, and ensure a seamless user experience, ultimately boosting overall system reliability and scalability. Several metrics employed for assessing the balance of loads in the cloud are illustrated in Fig.  6 .

Throughput:  In cloud load balancing, throughput refers to the rate at which a cloud infrastructure can process and serve data or requests. Specifically, it represents the amount of work accomplished within a given time frame, reflecting the efficiency of the system’s ability to handle concurrent user demands. High throughput ensures that data or requests can be processed quickly and reliably, minimising latency and optimising resource utilisation. Throughput (t p ) can be calculated by using the mathematical formula given in Eq. ( 1 ) below:
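The equation is not rendered in this text; assuming the usual tasks-per-total-execution-time form, consistent with the variables defined below, Eq. (1) can be written as:

$$ t_p = \frac{n}{\sum_{j=1}^{n} ExT_j} \qquad (1) $$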

where n is the number of tasks and ExT_j is the execution time of the j-th task.

Makespan: Makespan denotes the overall duration needed to finish a specific set of tasks or jobs within a cloud computing environment. Minimum makespan represents the efficiency and performance of the system in handling and processing tasks. It can be calculated with the help of the following formula:
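Assuming the standard definition of makespan as the largest per-VM completion time, consistent with the description of ExT_j below, Eq. (2) can be written as:

$$ \mathrm{Makespan} = \max_{1 \le j \le m} ExT_j \qquad (2) $$

with m denoting the number of virtual machines.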

In Eq. (2), ExT_j is the execution time on the j-th virtual machine. A robust and efficient load-balancing algorithm has a minimum makespan time.

Response time: Response time is the interval between when a user submits a request and when the cloud infrastructure delivers a response. Minimizing response time is crucial to providing a seamless user experience and ensuring optimal performance.

Reliability:  It indicates the system's ability to effectively handle failures, prevent downtime, and maintain continuous service availability. A reliable load balancer detects and mitigates failures promptly, ensures seamless failover mechanisms, and provides continuous, reliable service to users even in the event of disruptions or high-load conditions.

Migration time:  Migration time refers to the duration required to transfer workloads or applications from one server or data center to another within the cloud infrastructure. It encompasses the process of migrating virtual machines, containers, or services to optimize resource allocation and handle changes in demand.

Bandwidth:  It represents the capacity or available channel for data communication. It also refers to the maximum data capacity that may be transferred across a network connection within a specific period. Adequate bandwidth is essential for efficient load balancing, as it ensures the smooth and timely flow of data between servers and clients.

Resource utilization:  It refers to the efficient allocation and management of computing resources within a cloud infrastructure to meet the demands of varying workloads. It involves optimizing the utilization of servers, storage, network bandwidth, and other resources to maximize performance and minimize waste. It can be measured with the help of a mathematical formula, as given in Eq. ( 3 ):
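The formula is not rendered here; a common formulation consistent with the variables defined below expresses the utilization of the k-th VM as its busy time relative to the overall makespan, which would give Eq. (3) approximately as:

$$ ResU_k = \frac{\sum_{j} CT_{jk}}{\mathrm{Makespan}} \qquad (3) $$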

Fig. 6 Classification of load balancing algorithms

In Eq. (3), ResU_k is the resource utilization of the k-th virtual machine (VM), and CT_jk is the completion time of the j-th job on the k-th VM.

Energy consumption:  It can be defined as the ability of a cloud infrastructure to optimize its power consumption while maintaining optimal performance. It reduces energy consumption by dynamically allocating computing resources and powering down underutilized servers during low-demand periods. By minimizing power usage, cloud load balancing systems contribute to reducing carbon footprints, operational costs, and environmental impact while ensuring sustainable and eco-friendly operations in cloud computing environments.

Fault tolerance: The ability of a system to continue functioning uninterrupted in the presence of failures or errors. It involves designing load-balancing algorithms and mechanisms that can withstand and recover from various faults, such as server failures, network outages, or traffic spikes (Tawfeeg et al. 2022).

4.2 Taxonomy of load balancing algorithms and challenges associated with them

Mishra and Majhi ( 2020 ) have categorized the load balancing algorithms into four broad classes: Traditional, Heuristic, Meta-heuristic, and Hybrid. The authors have also explained the subcategories of meta-heuristic and hybrid algorithms based on their nature. Tawfeeg et al. ( 2022 ) have discussed three main categories of load-balancing algorithms, namely static, dynamic, and hybrid. Tripathy et al. ( 2023 ) mentioned in their review that the load-balancing algorithm based on their environment is generally classified into three main classes: static, dynamic, and nature-inspired. In this systematic review paper, we have tried to include the maximum range of algorithms by covering all the categories and sub-categories. Figure  6 represents all categories of load-balancing algorithms (Table  6 ).

Traditional Algorithms: Traditional algorithms are mainly classified into preemptive and non-preemptive. Preemptive means to forcefully stop an ongoing execution to serve a higher-priority task. After the completion of the execution of a higher-priority job, the preempted job is resumed. The priority of the task can be internal or external. Traditional algorithms commonly employed for load balancing include Round Robin (RR), Weighted Round Robin, Least Connection, and Weighted Least Connection. Round Robin assigns requests cyclically to each server, ensuring an equal distribution. Weighted Round Robin provides scalability by considering server weights and allocating a proportionate number of requests to each server based on its capabilities and performance (Praditha, et al. 2023 ). The Least connection (LC) algorithm assigns requests to the server with the fewest active connections, promoting load distribution efficiency. The Weighted Least Connection (WLC) enhances the previous algorithm by considering server weights. It assigns requests to servers with the least active connections, scaling the distribution based on server capabilities. Preemptive scheduling algorithms include round-robin and priority-based. Non-preemptive algorithms include Shortest Job First (SJF) and First Come First Serve (FCFS).
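For illustration, the sketch below shows how the weighted round-robin and least-connection policies described above might be realised (server names, weights, and connection counts are invented); the weighted least-connection variant would simply divide each server's active connections by its weight before taking the minimum:

```python
# Minimal sketches of two traditional policies: weighted round-robin and
# least-connection selection.
from itertools import cycle

servers = {"vm-a": 3, "vm-b": 1}            # name -> weight (capacity hint)
active = {"vm-a": 0, "vm-b": 0}             # name -> current active connections

# Weighted round-robin: expand each server proportionally to its weight and
# cycle through the expanded sequence.
wrr_sequence = cycle([s for s, w in servers.items() for _ in range(w)])

def pick_weighted_round_robin():
    return next(wrr_sequence)

def pick_least_connection():
    # Least connection: choose the server with the fewest active connections.
    return min(active, key=active.get)

for _ in range(4):
    s = pick_weighted_round_robin()
    active[s] += 1
    print("WRR ->", s)
print("LC  ->", pick_least_connection())    # vm-b, which received fewer requests
```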

Heuristic-based Algorithms: Heuristic algorithms are problem-solving techniques that rely on practical rules, intuition, and experience rather than precise mathematical models. These are used to find approximate solutions in a reasonable amount of time. The heuristic algorithms aim to distribute workload efficiently among cloud and fog nodes. Compared to hybrid and meta-heuristic algorithms, heuristic algorithms are relatively straightforward and have reduced computational complexity. They often provide reasonable solutions but lack guarantees of optimality. There are two types of heuristic techniques: static and dynamic. When a task’s estimated completion time is known, the static heuristic is used. When tasks arrive dynamically, a dynamic heuristic can be applied. Algorithms like Min-min, Max-min (Mao et al. 2014 ), RASA, Modified Heterogeneous Earliest Finish Time (HEFT) (Dubey et al. 2018 ), Improved Max-min (Hung et al. 2019 ) and DHSJF (Seth and Singh 2019 ) are the prominent examples of the heuristic category.

Meta-heuristic-based algorithms: Meta-heuristic algorithms are good at finding a global solution without falling into local optima. A meta-heuristic algorithm is a problem-solving technique that guides the search process by iteratively refining potential solutions. It is used to find approximate solutions for complex optimization problems, especially in cloud computing, where traditional algorithms often struggle due to the inherent complexity and dynamic nature of the environment. A particular meta-heuristic algorithm that has proven effective in cloud computing is the Genetic Algorithm (GA) (Rekha and Dakshayini 2019). GA mimics the process of natural selection, evolving a population of solutions to find strong candidates. By employing genetic operators like selection, crossover, and mutation, GA explores the solution space intelligently, adapting to changing conditions and providing near-optimal solutions for resource allocation, task scheduling, and load balancing in cloud computing environments. Other examples from the reviewed literature are GWO (Reddy et al. 2022), ACO (Dhaya and Kanthavel 2022), TBSLB PSO (Ramezani et al. 2014), TOPSIS-PSO (Konjaang et al. 2018), and Modified BAT (Latchoumi and Parthiban 2022). When two meta-heuristic methods are combined, the resulting method is a hybrid meta-heuristic. An example of a hybrid metaheuristic is Ant Colony Optimization with Particle Swarm (ACOPS) (Cho et al. 2015).
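The following toy sketch applies the genetic operators mentioned above (selection, crossover, mutation) to a task-to-VM assignment encoded as a chromosome, with makespan as the fitness to minimise; the execution-time matrix and GA parameters are illustrative only:

```python
# Toy genetic algorithm for task-to-VM assignment: a chromosome maps each task
# to a VM, fitness is the resulting makespan, and tournament selection,
# one-point crossover and random mutation evolve the population.
import random

exec_time = [[14, 16], [5, 9], [20, 11], [7, 6], [3, 8]]   # task x VM times
N_TASKS, N_VMS = len(exec_time), len(exec_time[0])

def makespan(chrom):
    load = [0.0] * N_VMS
    for task, vm in enumerate(chrom):
        load[vm] += exec_time[task][vm]
    return max(load)

def tournament(pop, k=3):
    return min(random.sample(pop, k), key=makespan)

def crossover(a, b):
    cut = random.randrange(1, N_TASKS)                      # one-point crossover
    return a[:cut] + b[cut:]

def mutate(chrom, rate=0.1):
    return [random.randrange(N_VMS) if random.random() < rate else g for g in chrom]

pop = [[random.randrange(N_VMS) for _ in range(N_TASKS)] for _ in range(20)]
for _ in range(100):
    pop = [mutate(crossover(tournament(pop), tournament(pop))) for _ in range(20)]
best = min(pop, key=makespan)
print(best, makespan(best))
```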

Hybrid-based algorithms: Hybrid algorithms integrate the advantages of centralized and distributed load-balancing algorithms to achieve better performance and scalability. They leverage the centralized approach to monitor and collect real-time information about the system's state, workload, and resource availability (Geetha et al. 2024). Simultaneously, they incorporate distributed load-balancing techniques to efficiently divide the workload among fog nodes. This hybrid approach enhances overall load-balancing efficiency, reduces network congestion, and improves the system's response time. By dynamically adapting to changing workload patterns and resource availability, hybrid algorithms ensure optimal resource utilization and enhance user satisfaction. A hybrid method that combines the Genetic Algorithm (GA) and the Grey Wolf Optimization Algorithm (GWO) is proposed by Behera and Sobhanayak (2024). The hybrid GWO-GA algorithm minimizes cost, energy usage, and makespan. Similarly, other examples from the literature review are GAYA (Mishra and Majhi 2023), VMMSID (Brahmam and Vijay Anand 2024), DTSO-TS (Ledmi et al. 2024), etc.

ML-Centric algorithms: These algorithms combine machine learning facilities with existing algorithms to automate the function. This is one of the latest approaches in the research area and has proven to be the best way to deal with real-time-based scenarios. To address the challenges of load balancing, researchers have been increasingly focusing on machine-learning-centric algorithms. ML-based algorithms offer promising results in load balancing by dynamically allocating tasks based on workload characteristics and resource availability. These algorithms leverage ML techniques such as reinforcement learning, deep learning, and clustering to intelligently predict and allocate the workload across cloud fog computing environments. ML-centric algorithms deliver improved performance, reduced response time, and enhanced resource utilization by continuously learning from historical data and adapting to changing conditions. Furthermore, these algorithms also consider energy consumption and network traffic factors, ensuring a holistic load-balancing approach (Muchori and Peter 2022 ). Examples of ML-centric algorithms from reviewed literature are DRL (Ran et al. 2019 ), MADRL-DRA (Jyoti and Shrimali 2020 ), TS-DT (Mahmoud et al. 2022 ), FF-NWRDLB (Prabhakara et al. 2023 ) etc.

Table 7 provides a comprehensive overview of recent load balancing and task scheduling algorithms, presenting information on the technology proposed, comparing technologies, research limitations, results, tools used, and potential future directions. Additionally, Table  8 outlines the evaluation metrics, advantages/disadvantages of the technologies reviewed, and objectives of the study.

5 Applications areas of load balancing in cloud and fog computing

There are various application areas where load balancing is crucial. The healthcare sector is one area where efficient resource utilization and load balancing are highly desirable. According to Mahmoud et al. (2018), fog computing integrated with IoT-based healthcare architecture improves latency, energy consumption, mobility, and quality of service, enabling efficient healthcare services regardless of location. Fog-enabled Cloud-of-Things (CoT) system models with energy-aware allocation strategies result in more energy-efficient operations, which are crucial for healthcare applications sensitive to delays and energy consumption. Yong et al. (2016) propose a dynamic load balancing approach using SDN technology in a cloud data center, enabling real-time monitoring of service node flow and load state, as well as global resource assignment for uneven system load distribution. Dogo et al. (2019) introduced a mist computing system for better connectivity and resource utilization in smart cities and industries. According to the authors, mist computing enables smart cities to intelligently adapt to dynamic events and changes, enhancing urban operations, and is well suited to realizing smart city solutions in which streets adapt to different conditions, promoting energy conservation and efficient operations. Similarly, Sharif et al. (2023) presented a paper that discusses the rapid growth of IoT devices and applications, emphasizing the need for efficient task scheduling and resource allocation in edge computing for health surveillance systems. The proposed Priority-based Task Scheduling and Resource Allocation (PTS-RA) mechanism aims to manage emergency conditions efficiently, meeting the requirements of latency-sensitive tasks with reduced bandwidth cost. On the same track, Aqeel et al. (2023) proposed a CHROA model that can be utilized for energy-efficient and intelligent load balancing in cloud-enabled IoT environments, particularly in healthcare, where real-time applications generate large volumes of data. Sah Tyagi et al. (2021) presented a neural network-based resource allocation model for an energy-efficient WSN-based smart Agri-IoT framework. The model improves dynamic clustering and optimizes cluster size. The approach combines BPNN (Backpropagation Neural Network), APSO (Adaptive Particle Swarm Optimization), and BNN (Binary Neural Network) to accomplish the effective allocation of agricultural resources, showcasing notable progress in cooperative networking and overall resource optimization. In the same manner, Dhaya and Kanthavel (2022) emphasize the importance of energy efficiency in agriculture and the challenges in resource allocation, and introduce a novel 'Naive Multi-Phase Resource Allocation Algorithm' to enhance energy efficiency and optimize agricultural resources effectively in a dynamic environment. There are thus several application areas where load balancing and resource scheduling are crucial. In the future, transportation, Industry 4.0 and 5.0, IoT network systems, smart cities, smart agriculture, and healthcare systems will be hotspots for research on load balancing. The following are the areas where resource allocation and utilization are critical and where cloud service utilization is highest:

Telemedicine (Verma et al. 2024 )

Industry 4.0 and Industry 5.0 (Teoh et al. 2023 )

Healthcare system (Talaat et al. 2022 )

Agriculture (Agri-IoT) (Dhaya and Kanthavel 2022 ; Sah Tyagi et al. 2021 )

Real-time monitoring services (Yong et al. 2016 )

Smart cities (Alam 2021 )

Digital twining (Zhou et al. 2022 ; Adibi et al. 2024 )

Smart business and analytics (Nag et al. 2022 )

E-commerce (Sugan and Isaac Sajan 2024 )

6 Research queries and inferences

After a detailed literature review, the answers to the research questions have been inferred without bias and without adding the researchers' own views. The inferences drawn from the examination of the existing material are given below in the form of answers.

Q1. What load balancing and task scheduling techniques are commonly used in cloud computing environments?

This SLR divides the current techniques into five categories: traditional, heuristic, meta-heuristic, ML-centric, and hybrid. We employed the content analysis method to determine the category of each technique used in the literature study, as shown in Table  7 . From the literature review, it has been inferred that hybrid, meta-heuristic and ML-centric algorithms/techniques are researchers’ favourite choices for solving load-balancing issues in a cloud computing system. The percentage-wise utilization of various techniques is depicted in Fig.  7 . In the future, ML/DL-based load-balancing algorithms will be the hotspot for researchers as there is an emerging trend of hybridising ML-centric approaches with existing ones.

Fig. 7 Percentage-wise utilisation of various categories of load balancing algorithm from 2014 to 2024 based on SLR

Q2. What are the key factors influencing the performance of load-balancing mechanisms in cloud computing?

The performance of load balancing in the cloud is influenced by several aspects, including the availability of resources such as CPU, memory, storage, and network bandwidth, the nature of the workload, network latency, the load balancer algorithm, and the health of the server as well as fault detection and tolerance. The selection of the load balancing algorithm can significantly influence performance, as different algorithms vary in complexity and efficiency, affecting how resources are distributed. In cases of server overload or issues, the load balancer must be able to identify these problems and redirect traffic to other servers to maintain optimal performance.

Q3. Which evaluation metrics are predominantly utilized for assessing the efficacy of load-balancing techniques in cloud computing environments?

The utilization trend of the various metrics over the period 2014–2024 is shown graphically in Fig. 8. We have employed the frequency analysis method to determine the year-wise utilization of each performance metric. Table 8 provides an in-depth analysis of the performance metrics attained in each study, and the year-wise categorization of each metric is shown in Table 9. The metrics most frequently used to gauge load balancing in cloud computing environments are makespan, resource utilization (RU), degree of imbalance (DI), cost efficiency, throughput, and execution time. Evaluation metrics such as fault tolerance, QoS, reliability, and migration rate require additional attention without compromising other factors. The row named 'other' in Table 9 includes parameters such as convergence speed, network longevity, fitness function, packet-loss ratio, success rate, task-scheduling efficiency, scalability, clustering-phase duration, standard deviation of load, accuracy, precision, and time complexity.

Fig. 8 The analysis of performance metrics used in load balancing based on SLR
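For readers less familiar with these metrics, the following sketch computes makespan, per-VM resource utilisation, throughput, and the degree of imbalance for a hypothetical task-to-VM assignment. The formulas follow their common usage in the load-balancing literature (for example, DI as the spread of VM completion times relative to their mean); the numbers themselves are illustrative only.

```python
# Minimal sketch: common load-balancing metrics computed from a hypothetical
# assignment of task execution times to VMs. Definitions follow their usual
# form in the literature: makespan = max VM completion time,
# DI = (T_max - T_min) / T_avg over the VM completion times.

def metrics(vm_task_times):
    completion = [sum(times) for times in vm_task_times]   # per-VM completion time
    makespan = max(completion)
    t_avg = sum(completion) / len(completion)
    degree_of_imbalance = (max(completion) - min(completion)) / t_avg
    utilisation = [c / makespan for c in completion]        # busy fraction per VM
    num_tasks = sum(len(times) for times in vm_task_times)
    throughput = num_tasks / makespan                        # tasks finished per unit time
    return makespan, degree_of_imbalance, utilisation, throughput

if __name__ == "__main__":
    assignment = [[8, 2, 6], [3, 7, 1], [9, 4]]   # hypothetical execution times per VM
    ms, di, ru, tp = metrics(assignment)
    print(f"makespan={ms}, DI={di:.2f}, "
          f"utilisation={[round(u, 2) for u in ru]}, throughput={tp:.2f}")
```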

Q4. Which categories of algorithms have been used more in recent research trends in the cloud computing environment for solving load-balancing issues?

According to Fig. 9, researchers prefer hybrid algorithms for addressing load-balancing and task-scheduling problems in cloud computing. This preference arises because hybrid algorithms combine the functionalities of several algorithms, resulting in precise, multi-objective solutions to task-scheduling and load-balancing challenges. Around 2014, heuristic approaches were commonly used, but meta-heuristic approaches later replaced them; by 2022, the hybrid approach had become the dominant method. Interestingly, many of these hybrid techniques incorporate machine learning combined with other optimization methods.

Fig. 9 Year-wise utilisation trend of various techniques used in load balancing
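The appeal of hybridisation can be illustrated with a deliberately small sketch, not an algorithm from any surveyed paper: a greedy heuristic seeds an initial task-to-VM assignment, and a simple local search, standing in for the metaheuristic component, then accepts random single-task moves only when they reduce the makespan.

```python
# Minimal sketch of hybridisation: a greedy heuristic seeds the solution,
# then a simple local search (a stripped-down stand-in for a metaheuristic)
# repeatedly tries random single-task reassignments and keeps improvements.
import random

def greedy_seed(task_lengths, num_vms):
    """Heuristic stage: assign each task to the currently least-loaded VM."""
    assign, loads = [0] * len(task_lengths), [0.0] * num_vms
    for i, length in enumerate(task_lengths):
        vm = loads.index(min(loads))
        assign[i] = vm
        loads[vm] += length
    return assign

def makespan(assign, task_lengths, num_vms):
    loads = [0.0] * num_vms
    for i, vm in enumerate(assign):
        loads[vm] += task_lengths[i]
    return max(loads)

def hybrid_schedule(task_lengths, num_vms, iters=2000):
    """Search stage: refine the greedy seed with improving single-task moves."""
    assign = greedy_seed(task_lengths, num_vms)
    best = makespan(assign, task_lengths, num_vms)
    for _ in range(iters):
        i, new_vm = random.randrange(len(task_lengths)), random.randrange(num_vms)
        old_vm = assign[i]
        if new_vm == old_vm:
            continue
        assign[i] = new_vm
        cand = makespan(assign, task_lengths, num_vms)
        if cand < best:
            best = cand          # keep the improving move
        else:
            assign[i] = old_vm   # revert a non-improving move
    return assign, best

if __name__ == "__main__":
    random.seed(7)
    tasks = [random.randint(1, 20) for _ in range(30)]  # hypothetical task lengths
    seed_ms = makespan(greedy_seed(tasks, 4), tasks, 4)
    _, refined_ms = hybrid_schedule(tasks, num_vms=4)
    print(f"greedy makespan={seed_ms}, after local search={refined_ms}")
```

Real hybrids in the literature replace the local search with PSO, GA, or learning-based components, but the division of labour, in which a cheap heuristic provides a starting point that a search procedure refines, is the same.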

Q5. Which simulation software tools have garnered prominence in recent scholarly analyses within the domain of cloud computing research?

Figure 10 shows that 51% of the researchers use the CloudSim tool for simulation purposes, followed by Python at 11%. We employed the frequency analysis method to quantify and compare the utilization of different simulation tools within each study. According to the literature review, CloudSim is the first choice of researchers, with 51% utilization, and its use has grown in the last few years. It allows users to model and simulate cloud computing infrastructure, resource-provisioning policies, and application-scheduling algorithms. CloudSim is an external framework that is available for download and can be imported into development environments and build tools such as Eclipse, NetBeans IDE, and Maven. To simulate the cloud computing environment, the CloudSim toolkit has, for example, been integrated with NetBeans IDE 8.2 running on Windows 10 (Vergara et al. 2023).

Fig. 10 Analysis of simulation tools based on SLR

Q6. What insights do the future perspectives within the reviewed literature offer in terms of potential avenues for exploration and advancement within the field?

Based on the reviewed literature, the future directions of this field focus on developing more advanced algorithms that harness the potential of machine learning and deep learning, enabling enhanced energy efficiency and overall system performance in cloud computing environments. Real-time monitoring and automation of systems using AI approaches are also promising topics for future research. The future scopes recorded during the literature review are shown in Table 7.

All the responses in this study are deduced and documented based on the above literature review. It is important to note that these responses are impartial and reflect the reviewed studies rather than the authors' own opinions.

8 Statistical analysis

The SLR includes a bibliographic analysis to understand the development and present condition of research in the field, investigating the dissemination of scholarly materials, which can reveal both dominant patterns and possible gaps within the academic body of work. We used the Scopus academic database to collect relevant records based on the keywords "load balancing and task scheduling in cloud computing using machine learning", which returned a total of 129 items. The analysis of this dataset illustrates the distribution of documents published across several critical subject areas and offers valuable insights into the current priorities and interests of the academic community.

These publications are distributed across various subjects, providing insights into the interdisciplinary nature of this field, as shown in Fig.  11 .

Fig. 11 Subject-wise analysis of publications from 2014 to 2024 related to used keywords

9 Discussion

Our extensive literature study has uncovered valuable insights and emerging trends crucial for advancing cloud computing technology. This discussion summarizes the research findings, answering the initial research questions and drawing conclusions based on a thorough examination of the studies selected between 2014 and 2024.

9.1 Research gaps

Most research efforts concentrate on a single aspect of load balancing; many systems are limited to either data-center or network load balancing. There is a pressing need to address multiple aspects simultaneously.

A centralized load balancer is itself a single point of failure. Furthermore, most of the research concentrates on only a limited set of performance parameters, such as makespan, throughput, and completion time; the degree of imbalance (DI) is a crucial parameter that deserves more attention.

There is a significant need to enhance quality measures such as QoS (Quality of Service), fault tolerance, network delay, VM (Virtual Machine) migration and risk assessment.

Fog and edge computing need to be integrated with the cloud to mitigate the need for massive data transfers; this will improve the flexibility and usefulness of cloud computing in multiple sectors.

Power conservation mechanisms have not been given much thought by researchers; there is a shortage of innovative thinking on power conservation in the context of load balancing.

Geographical barriers introduce network delay and data-transmission delay issues. The development of cutting-edge technologies is needed to overcome these distance- and delay-related problems (Muchori and Peter 2022).

Virtual machine migration (VMM) is another challenge that strongly impacts the efficacy of cloud services. There is a dire need for techniques that require fewer VM migrations.

Despite the advancements, applying machine learning algorithms in cloud computing is complicated. The intricacy of these algorithms, combined with the requirement for extensive training data, presents substantial obstacles. The dynamic nature of cloud environments requires constant learning and adjustment of these models, which raises questions about their ability to handle large-scale operations and maintain long-term viability.

9.2 Integration of machine learning for enhanced load balancing and task scheduling

One key insight from this analysis is a growing reliance on machine learning methods to enhance load balancing and task scheduling processes. Although somewhat successful, conventional algorithms generally struggle in dynamic cloud systems where data and workload patterns continuously change. Due to their capacity to acquire knowledge and adjust accordingly, machine learning algorithms have demonstrated potential in forecasting workload patterns, enabling the implementation of more effective resource allocation strategies. This enhances efficiency and substantially decreases execution time and energy consumption, aligning with the objectives of achieving optimal resource utilisation and high system throughput (Janakiraman and Priya 2023 ; Edward Gerald et al. 2023 ).
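As a toy illustration of this idea, and not the method of any particular cited study, the sketch below fits a linear trend to a short history of hypothetical CPU-demand samples, forecasts the next interval, and uses the forecast to decide how many VMs to provision ahead of time.

```python
# Minimal sketch: predictive scaling driven by a learned workload trend.
# A linear model is fit to recent CPU-demand samples (hypothetical values),
# the next interval is forecast, and VM capacity is provisioned to match.
import numpy as np

VM_CAPACITY = 100.0          # hypothetical capacity units a single VM can serve
TARGET_UTILISATION = 0.7     # provision so that predicted load stays below 70%

def forecast_next(history):
    """Fit load = a*t + b to the observed samples and extrapolate one step."""
    t = np.arange(len(history))
    a, b = np.polyfit(t, np.asarray(history, dtype=float), deg=1)
    return max(0.0, a * len(history) + b)

def vms_needed(predicted_load):
    """Smallest VM count keeping predicted utilisation under the target."""
    return int(np.ceil(predicted_load / (VM_CAPACITY * TARGET_UTILISATION)))

if __name__ == "__main__":
    cpu_demand = [120, 135, 150, 170, 185, 210]   # hypothetical demand per interval
    predicted = forecast_next(cpu_demand)
    print(f"predicted demand: {predicted:.1f} units -> provision {vms_needed(predicted)} VM(s)")
```

In practice, the surveyed studies replace the linear trend with neural or reinforcement-learning models, but the control loop of observing, predicting, and then allocating is the same.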

9.3 Future directions

The future of cloud computing rests on advancing auto-adaptive systems capable of handling load balancing and task scheduling independently, without human involvement. Fusing artificial intelligence (AI) with cloud computing can create systems that provide unparalleled efficiency and reliability. The delivery of efficient cloud services could be significantly improved by developing lightweight machine learning models that require minimal training data and can quickly adapt to changing conditions. Moreover, investigating unsupervised learning algorithms could eliminate the requirement for large labelled datasets, enhancing practical applicability. The following are some of the most frequently observed future scopes based on this SLR:

Deployment of deep learning (DL) and machine learning (ML) techniques to predict load patterns: The predictive analysis of workload patterns can prevent resource underutilization or overloading. We can also use ML to reduce energy consumption and predict faults in cloud computing (Reddy et al. 2022 ; Mishra and Majhi 2023 ; Agarwal et al. 2020 ; Negi et al. 2021 ; Latchoumi and Parthiban 2022 ; Shuaib, et al. 2023 ).

Development of fault-tolerance techniques integrated with load balancing: only a small number of research studies examine load balancing and fault tolerance of cloud computing services together, and those that do rarely elaborate on the connection between the two (Behera and Sobhanayak 2024; Tawfeeg et al. 2022; Brahmam and Vijay Anand 2024).

To extend the existing techniques for data security and privacy by incorporating blockchain technology with cloud computing (Edward Gerald et al. 2023 ; Saba et al. 2023 ; Li et al. 2020 ).

Achieving additional QoS metrics, such as scalability, elasticity, and applicability across extensive domains, is also a scope for extending the research work (Adil et al. 2022; Talaat et al. 2022; Sultana et al. 2024).

Most researchers have focused on the energy-consumption aspect. Future research should continue to target energy efficiency, as energy is going to be one of the scarcest resources in the future (Rekha and Dakshayini 2019; Farrag et al. 2020; Panwar et al. 2019; Mahmoud et al. 2022; Asghari and Sohrabi 2021).

Cost-effectiveness and real-time load balancing are prominent research areas. Most researchers plan to extend their work to real-time analytics and dynamic cloud networks (Kumar and Sharma 2018; Ni et al. 2021).

Response delay is crucial in real-time applications, and real-time analytics in complex, dynamic environments is a hotspot for researchers. Healthcare systems, telemedicine, and real-time monitoring or surveillance services are examples of delay-sensitive applications (Verma et al. 2024; Pradhan et al. 2022; Nabi et al. 2022; Shahakar et al. 2023).

Dynamic reallocation of dependent tasks is another scope for future research; task-priority-based scheduling optimizes cloud performance (a minimal illustrative sketch is given after this list) (Ran et al. 2019; Jena et al. 2022; Prabhakara et al. 2023).

Fog and edge computing architectures have limited resources, so optimal resource scheduling is essential. Many authors have also discussed resource scheduling in fog and edge computing as a potential future area of study (Swarup et al. 2021; Kruekaew and Kimpan 2022).

This SLR records the future research scopes mentioned above, and Table  7 provides detailed information.
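As a minimal illustration of the priority- and dependency-aware scheduling mentioned in the list above, the following generic sketch, which is not the mechanism of any cited work, releases a task only once all of its predecessors have finished and always starts the highest-priority released task on the VM that becomes free earliest. Task names, priorities, and durations are hypothetical.

```python
# Minimal sketch: priority-based scheduling of dependent tasks onto VMs.
# A task becomes ready when all its predecessors have completed; among ready
# tasks the highest priority (smallest number) starts first, on the VM that
# becomes free earliest. All task data below are hypothetical.
import heapq

def schedule(tasks, deps, num_vms):
    """tasks: {name: (priority, duration)}; deps: {name: [predecessor names]}."""
    indegree = {t: len(deps.get(t, [])) for t in tasks}
    dependants = {t: [] for t in tasks}
    for t, preds in deps.items():
        for p in preds:
            dependants[p].append(t)

    ready = [(prio, name) for name, (prio, _) in tasks.items() if indegree[name] == 0]
    heapq.heapify(ready)
    vm_free = [0.0] * num_vms                 # time at which each VM becomes idle
    finish = {}
    while ready:
        prio, name = heapq.heappop(ready)
        vm = vm_free.index(min(vm_free))      # earliest-free VM
        earliest = max([vm_free[vm]] + [finish[p] for p in deps.get(name, [])])
        finish[name] = earliest + tasks[name][1]
        vm_free[vm] = finish[name]
        for nxt in dependants[name]:          # release tasks whose predecessors are done
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                heapq.heappush(ready, (tasks[nxt][0], nxt))
    return finish

if __name__ == "__main__":
    tasks = {"ingest": (1, 4), "clean": (2, 3), "train": (1, 6), "report": (3, 2)}
    deps = {"clean": ["ingest"], "train": ["clean"], "report": ["clean"]}
    print(schedule(tasks, deps, num_vms=2))   # finish time of each task
```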

10 Conclusion

The field of cloud computing is vast and comes with numerous challenges. By giving end users on-demand access to computational resources, it has led to the widespread use of cloud services, and this adoption has made cloud computing an essential part of various businesses, notably online shopping sites. The increased usage has put more strain on cloud resources such as hardware, software, and network devices; consequently, load-balancing solutions are needed for efficient utilization of these resources. This SLR categorizes the surveyed technologies into five classes: conventional/traditional, heuristic, meta-heuristic, ML-centric, and hybrid. Traditional approaches are time-consuming, slow, often become stuck in local optima, and struggle to scale with problem size and complexity. Heuristic algorithms, which demonstrate remarkable scalability, are suitable for large-scale optimization challenges in industries such as manufacturing, banking, and logistics, but they often produce approximate rather than optimal answers; meta-heuristic algorithms emerged to address these drawbacks. In recent years, hybrid strategies, which combine heuristic, conventional, and machine-learning approaches, have become increasingly popular; these approaches aim to exploit the advantages of several algorithms to overcome individual limitations and improve performance.

This systematic literature review of efficient load balancing and task scheduling in cloud computing environments has provided valuable insights into the different algorithms, research limitations, evaluation metrics, challenges, simulation tools, and potential future directions. The analysis demonstrates that the current trend involves the utilization of ML-centric and hybrid algorithms to address load-balancing and task-scheduling issues effectively, and the findings indicate a growing interest among researchers in ML-centric techniques, showcasing a shift towards incorporating ML/DL approaches. Our study also explained the fundamental structure of cloud computing and its operational principles, provided an impartial examination of evaluation metrics and simulation tools, and answered the research questions that formed the basis of this review with well-supported answers derived from the gathered information. This systematic review thus serves as a foundational resource for future work in this domain and offers valuable information to researchers and practitioners involved in load balancing in cloud computing architectures. This SLR does not delve into specific security and privacy considerations related to load balancing; this is retained as a topic for our future investigation. Table 10 provides abbreviations for several terms.

Data availability

No datasets were generated or analysed during the current study.

Adibi S, Rajabifard A, Shojaei D, Wickramasinghe N (2024) Enhancing healthcare through sensor-enabled digital twins in smart environments: a comprehensive analysis. Sensors. https://doi.org/10.3390/s24092793


Adil M, Nabi S, Raza S (2022) PSO-CALBA: Particle swarm optimization based content-aware load balancing algorithm in cloud computing environment. Comput Inform 41(5):1157–1185. https://doi.org/10.31577/cai_2022_5_1157

Adil M, Nabi S, Aleem M, Diaz VG, Lin JC-W (2023) CA-MLBS: content-aware machine learning based load balancing scheduler in the cloud environment. Expert Syst. https://doi.org/10.1111/exsy.13150

Agarwal R, Baghel N, Khan MA (2020) Load balancing in cloud computing using mutation based particle swarm optimization. In: presented at the 2020 International Conference on Contemporary Computing and Applications, IC3A 2020, pp 191–195 https://doi.org/10.1109/IC3A48958.2020.233295

Alahmad Y, Agarwal A (2024) Multiple objectives dynamic VM placement for application service availability in cloud networks. J Cloud Comput. https://doi.org/10.1186/s13677-024-00610-2

Alam T (2021) Cloud-based iot applications and their roles in smart cities. Smart Cities 4(3):1196–1219. https://doi.org/10.3390/smartcities4030064

Alatoun K, Matrouk K, Mohammed MA, Nedoma J, Martinek R, Zmij P (2022) A novel low-latency and energy-efficient task scheduling framework for internet of medical things in an edge fog cloud system. Sensors. https://doi.org/10.3390/s22145327

Apat HK, Nayak R, Sahoo B (2023) A comprehensive review on internet of things application placement in Fog computing environment. InteRnet Things Neth. https://doi.org/10.1016/j.iot.2023.100866

Aqeel I et al (2023) Load balancing using artificial intelligence for cloud-enabled internet of everything in healthcare domain. Sensors. https://doi.org/10.3390/s23115349

Asghari A, Sohrabi MK (2021) Combined use of coral reefs optimization and reinforcement learning for improving resource utilization and load balancing in cloud environments. Computing 103(7):1545–1567. https://doi.org/10.1007/s00607-021-00920-2

Behera I, Sobhanayak S (2024) Task scheduling optimization in heterogeneous cloud computing environments: a hybrid GA-GWO approach. J Parallel Distrib Comput. https://doi.org/10.1016/j.jpdc.2023.104766

Biswas D, Dutta A, Ghosh S, Roy P (2024) Future trends and significant solutions for intelligent computing resource management, pp 187–208. https://doi.org/10.4018/979-8-3693-1552-1.ch010

Brahmam MG, Vijay Anand R (2024) VMMISD: an efficient load balancing model for virtual machine migrations via fused metaheuristics with iterative security measures and deep learning optimizations. IEEE Access. https://doi.org/10.1109/ACCESS.2024.3373465

Buyya R et al (2018) A manifesto for future generation cloud computing: research directions for the next decade. ACM Comput Surv. https://doi.org/10.1145/3241737

Cho K-M, Tsai P-W, Tsai C-W, Yang C-S (2015) A hybrid meta-heuristic algorithm for VM scheduling with load balancing in cloud computing. Neural Comput Appl 26(6):1297–1309. https://doi.org/10.1007/s00521-014-1804-9

Dhaya R, Kanthavel R (2022) Energy efficient resource allocation algorithm for agriculture IoT. Wirel Pers Commun 125(2):1361–1383. https://doi.org/10.1007/s11277-022-09607-z

Dogo EM, Salami AF, Aigbavboa CO, Nkonyana T (2019) Taking cloud computing to the extreme edge: a review of mist computing for smart cities and industry 4.0 in Africa. In: EAI/Springer Innovations in Communication and Computing, pp 107–132 https://doi.org/10.1007/978-3-319-99061-3_7

Dubey K, Kumar M, Sharma SC (2018) Modified HEFT algorithm for task scheduling in cloud environment. In: presented at the Procedia Computer Science, pp 725–732 https://doi.org/10.1016/j.procs.2017.12.093

Edward Gerald B, Geetha P, Ramaraj E (2023) A fruitfly-based optimal resource sharing and load balancing for the better cloud services. Soft Comput 27(10):6507–6520. https://doi.org/10.1007/s00500-023-07873-y

Farrag AAS, Mohamad SA, El-Horbaty ESM (2020) Swarm optimization for solving load balancing in cloud computing. In: presented at the advances in intelligent systems and computing, pp 102–113 https://doi.org/10.1007/978-3-030-14118-9_11

Geetha P, Vivekanandan SJ, Yogitha R, Jeyalakshmi MS (2024) Optimal load balancing in cloud: Introduction to hybrid optimization algorithm. Expert Syst Appl. https://doi.org/10.1016/j.eswa.2023.121450

Goel G, Tiwari R (2023) Resource scheduling techniques for optimal quality of service in fog computing environment: a review. Wirel Pers Commun 131(1):141–164. https://doi.org/10.1007/s11277-023-10421-4

Hashem W, Nashaat H, Rizk R (2017) Honey bee based load balancing in cloud computing. KSII Trans Internet Inf Syst 11(12):5694–5711. https://doi.org/10.3837/tiis.2017.12.001

Hung TC, Hy PT, Hieu LN, Phi NX (2019) MMSIA: improved max-min scheduling algorithm for load balancing on cloud computing. In: presented at the ACM International Conference Proceeding Series, pp 60–64 https://doi.org/10.1145/3310986.3311017

Huo L, Shao P, Ying F, Luo L (2019) The research on task scheduling algorithm for the cloud management platform of mimic common operating environment. In: presented at the Proceedings - 2019 18th International Symposium on Distributed Computing and Applications for Business Engineering and Science, DCABES 2019, pp 167–171 https://doi.org/10.1109/DCABES48411.2019.00049

Jalalian Z, Sharifi M (2022) A hierarchical multi-objective task scheduling approach for fast big data processing. J Supercomput 78(2):2307–2336. https://doi.org/10.1007/s11227-021-03960-9

Janakiraman S, Priya MD (2023) Hybrid grey wolf and improved particle swarm optimization with adaptive intertial weight-based multi-dimensional learning strategy for load balancing in cloud environments. Sustain Comput Inform Syst. https://doi.org/10.1016/j.suscom.2023.100875

Jena UK, Das PK, Kabat MR (2022) Hybridization of meta-heuristic algorithm for load balancing in cloud computing environment. J King Saud Univ Comput Inf Sci 34(6):2332–2342. https://doi.org/10.1016/j.jksuci.2020.01.012

Joshi S, Panday N, Mishra A (2024) Reinforcement learning based auto scaling strategy used in cloud environment: State of Art, p 736 https://doi.org/10.1109/CSNT60213.2024.10545922

Jyoti A, Shrimali M (2020) Dynamic provisioning of resources based on load balancing and service broker policy in cloud computing. Clust Comput 23(1):377–395. https://doi.org/10.1007/s10586-019-02928-y

Khodar A, Chernenkaya LV, Alkhayat I, Fadhil Al-Afare HA, Desyatirikova EN (2020) Design model to improve task scheduling in cloud computing based on particle swarm optimization. In: presented at the Proceedings of the 2020 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering, EIConRus 2020, pp 345–350 https://doi.org/10.1109/EIConRus49466.2020.9039501

Kiruthiga G, Maryvennila S (2020) Robust resource scheduling with optimized load balancing using grasshopper behavior empowered intuitionistic fuzzy clustering in cloud paradigm. Int J Comput Netw Appl 7(5):137–145. https://doi.org/10.22247/ijcna/2020/203851

Konjaang JK, Ayob FH, Muhammed A (2018) Cost effective Expa-Max-Min scientific workflow allocation and load balancing strategy in cloud computing. J Comput Sci 14(5):623–638. https://doi.org/10.3844/jcssp.2018.623.638

Kruekaew B, Kimpan W (2022) Multi-objective task scheduling optimization for load balancing in cloud computing environment using hybrid artificial bee colony algorithm with reinforcement learning. IEEE Access 10:17803–17818. https://doi.org/10.1109/ACCESS.2022.3149955

Kumar M, Dubey K, Sharma SC (2018) Elastic and flexible deadline constraint load Balancing algorithm for Cloud Computing. In: presented at the Procedia Computer Science, pp 717–724 https://doi.org/10.1016/j.procs.2017.12.092

Kumar M, Sharma SC (2018) Deadline constrained based dynamic load balancing algorithm with elasticity in cloud environment. Comput Electr Eng 69:395–411. https://doi.org/10.1016/j.compeleceng.2017.11.018

Latchoumi TP, Parthiban L (2022) Quasi oppositional dragonfly algorithm for load balancing in cloud computing environment. Wirel Pers Commun 122(3):2639–2656. https://doi.org/10.1007/s11277-021-09022-w

Ledmi A, Ledmi M, Souidi MEH, Haouassi H, Bardou D (2024) Optimizing task scheduling in cloud computing using discrete tuna swarm optimization. Ing Syst Inf 29(1):323–335. https://doi.org/10.18280/isi.290132

Li X, Qin Y, Zhou H, Chen D, Yang S, Zhang Z (2020) An intelligent adaptive algorithm for servers balancing and tasks scheduling over mobile fog computing networks. Wirel Commun Mob Comput. https://doi.org/10.1155/2020/8863865

Liu X, Qiu T, Wang T (2019) Load-balanced data dissemination for wireless sensor networks: a nature-inspired approach. IEEE Internet Things J 6(6):9256–9265. https://doi.org/10.1109/JIOT.2019.2900763

Mahmoud MME, Rodrigues JJPC, Saleem K, Al-Muhtadi J, Kumar N, Korotaev V (2018) Towards energy-aware fog-enabled cloud of things for healthcare. Comput Electr Eng 67:58–69. https://doi.org/10.1016/j.compeleceng.2018.02.047

Mahmoud H, Thabet M, Khafagy MH, Omara FA (2022) Multiobjective task scheduling in cloud environment using decision tree algorithm. IEEE Access 10:36140–36151. https://doi.org/10.1109/ACCESS.2022.3163273

Mao Y, Chen X, Li X (2014) Max–min task scheduling algorithm for load balance in cloud computing. Adv Intell Syst Comput 255:457–465. https://doi.org/10.1007/978-81-322-1759-6_53

Mishra K, Majhi SK (2020) A state-of-art on cloud load balancing algorithms. Int J Comput Digit Syst 9(2):201–220. https://doi.org/10.12785/IJCDS/090206

Mishra K, Majhi SK (2023) A novel improved hybrid optimization algorithm for efficient dynamic medical data scheduling in cloud-based systems for biomedical applications. Multimed Tools Appl 82(18):27087–27121. https://doi.org/10.1007/s11042-023-14448-4

Mousavi S, Mosavi A, Varkonyi-Koczy AR (2018) A load balancing algorithm for resource allocation in cloud computing. In: presented at the advances in intelligent systems and computing, pp 289–296 https://doi.org/10.1007/978-3-319-67459-9_36

Muchori J, Peter M (2022) Machine learning load balancing techniques in cloud computing: a review. Int J Comput Appl Technol Res 11:179–186. https://doi.org/10.7753/IJCATR1106.1002

Nabi S, Ahmad M, Ibrahim M, Hamam H (2022) AdPSO: adaptive PSO-based task scheduling approach for cloud computing. Sensors. https://doi.org/10.3390/s22030920

Nag A, Sen M, Saha J (2022) Integration of predictive analytics and cloud computing for mental health prediction. In: Predictive Analytics in Cloud, Fog, and Edge Computing: Perspectives and Practices of Blockchain, IoT, and 5G, pp 133–160 https://doi.org/10.1007/978-3-031-18034-7_8

Neelakantan P, Yadav NS (2023) An optimized load balancing strategy for an enhancement of cloud computing environment. Wirel Pers Commun 131(3):1745–1765. https://doi.org/10.1007/s11277-023-10520-2

Negi S, Rauthan MMS, Vaisla KS, Panwar N (2021) CMODLB: an efficient load balancing approach in cloud computing environment. J Supercomput 77(8):8787–8839. https://doi.org/10.1007/s11227-020-03601-7

Ni L, Sun X, Li X, Zhang J (2021) GCWOAS2: multiobjective task scheduling strategy based on gaussian cloud-whale optimization in cloud computing. Comput Intell Neurosci. https://doi.org/10.1155/2021/5546758

Oduwole O, Akinboro S, Lala O, Fayemiwo M, Olabiyisi S (2022) Cloud computing load balancing techniques: retrospect and recommendations. FUOYE J Eng Technol 7:17–22. https://doi.org/10.46792/fuoyejet.v7i1.753

Pabitha P, Nivitha K, Gunavathi C, Panjavarnam B (2024) A chameleon and remora search optimization algorithm for handling task scheduling uncertainty problem in cloud computing. Sustain Comput Inform Syst. https://doi.org/10.1016/j.suscom.2023.100944

Pang S, Zhang W, Ma T, Gao Q (2017) Ant colony optimization algorithm to dynamic energy management in cloud data center. Math Probl Eng. https://doi.org/10.1155/2017/4810514

Panwar N, Negi S, Rauthan MMS, Vaisla KS (2019) TOPSIS–PSO inspired non-preemptive tasks scheduling algorithm in cloud environment. Clust Comput 22(4):1379–1396. https://doi.org/10.1007/s10586-019-02915-3

Prabhakara BK, Naikodi C, Suresh L (2023) Ford fulkerson and Newey West regression based dynamic load balancing in cloud computing for data communication. Int J Comput Netw Inf Secur 15(5):81–95. https://doi.org/10.5815/IJCNIS.2023.05.08

Pradhan A, Bisoy SK, Kautish S, Jasser MB, Mohamed AW (2022) Intelligent decision-making of load balancing using deep reinforcement learning and parallel PSO in cloud environment. IEEE Access 10:76939–76952. https://doi.org/10.1109/ACCESS.2022.3192628

Praditha VS et al (2023) A Systematical review on round robin as task scheduling algorithms in cloud computing. In: presented at the 2023 6th International Conference on Information and Communications Technology, ICOIACT 2023, pp 516–521 https://doi.org/10.1109/ICOIACT59844.2023.10455832

Prashanth SK, Raman D, (2021) Optimized dynamic load balancing in cloud environment using B+ Tree. In: presented at the Advances in Intelligent Systems and Computing, pp 391–401 https://doi.org/10.1007/978-981-33-4859-2_39

Ramezani F, Lu J, Hussain FK (2014) Task-based system load balancing in cloud computing using particle swarm optimization. Int J Parallel Prog 42(5):739–754. https://doi.org/10.1007/s10766-013-0275-4

Ran L, Shi X, Shang M (2019) SLAs-aware online task scheduling based on deep reinforcement learning method in cloud environment. In: presented at the Proceedings - 21st IEEE International Conference on High Performance Computing and Communications, 17th IEEE International Conference on Smart City and 5th IEEE International Conference on Data Science and Systems, HPCC/SmartCity/DSS 2019, pp 1518–1525 https://doi.org/10.1109/HPCC/SmartCity/DSS.2019.00209

Reddy KL, Lathigara A, Aluvalu R, Viswanadhula UM (2022) PGWO-AVS-RDA: An intelligent optimization and clustering based load balancing model in cloud. Concurr Comput Pract Exp. https://doi.org/10.1002/cpe.7136

Rekha PM, Dakshayini M (2019) Efficient task allocation approach using genetic algorithm for cloud environment. Clust Comput 22(4):1241–1251. https://doi.org/10.1007/s10586-019-02909-1

Rostami S, Broumandnia A, Khademzadeh A (2024) An energy-efficient task scheduling method for heterogeneous cloud computing systems using capuchin search and inverted ant colony optimization algorithm. J Supercomput 80(6):7812–7848. https://doi.org/10.1007/s11227-023-05725-y

Saba T, Rehman A, Haseeb K, Alam T, Jeon G (2023) Cloud-edge load balancing distributed protocol for IoE services using swarm intelligence. Clust Comput 26(5):2921–2931. https://doi.org/10.1007/s10586-022-03916-5

Sabireen H, Neelanarayanan V (2021) A Review on Fog computing: architecture, Fog with IoT, algorithms and research challenges. ICT Express 7(2):162–176. https://doi.org/10.1016/j.icte.2021.05.004

Sah Tyagi SK, Mukherjee A, Pokhrel SR, Hiran KK (2021) An intelligent and optimal resource allocation approach in sensor networks for smart Agri-IoT. IEEE Sens J 21(16):17439–17446. https://doi.org/10.1109/JSEN.2020.3020889

Santhanakrishnan M, Valarmathi K (2022) Load balancing techniques in cloud environment - a big picture analysis, p 310 https://doi.org/10.1109/ICCST55948.2022.10040387

Seth S, Singh N (2019) Dynamic heterogeneous shortest job first (DHSJF): a task scheduling approach for heterogeneous cloud computing systems. Int J Inf Technol Singap 11(4):653–657. https://doi.org/10.1007/s41870-018-0156-6

Shafiq DA, Jhanjhi N, Abdullah A (2019) Proposing a load balancing algorithm for the optimization of cloud computing applications. In: presented at the MACS 2019 - 13th International Conference on Mathematics, Actuarial Science, Computer Science and Statistics, Proceedings https://doi.org/10.1109/MACS48846.2019.9024785

Shahakar M, Mahajan S, Patil L (2023) Load balancing in distributed cloud computing: a reinforcement learning algorithms in heterogeneous environment. Int J Recent Innov Trends Comput Commun 11(2):65–74. https://doi.org/10.17762/ijritcc.v11i2.6130

Shakkeera L, Tamilselvan L (2016) QoS and load balancing aware task scheduling framework for mobile cloud computing environment. Int J Wirel Mob Comput 10(4):309–316. https://doi.org/10.1504/IJWMC.2016.078201

Sharif Z, Tang Jung L, Ayaz M, Yahya M, Pitafi S (2023) Priority-based task scheduling and resource allocation in edge computing for health monitoring system. J King Saud Univ Comput Inf Sci 35(2):544–559. https://doi.org/10.1016/j.jksuci.2023.01.001

Shetty S, Shetty S (2019) Analysis of load balancing in cloud data centers. J Ambient Intell Humaniz Comput 15:1–9. https://doi.org/10.1007/s12652-018-1106-7

Shuaib M et al (2023) An optimized, dynamic, and efficient load-balancing framework for resource management in the internet of things (IoT) environment. Electron SwiTz. https://doi.org/10.3390/electronics12051104

Souri A, Norouzi M, Alsenani Y (2024) A new cloud-based cyber-attack detection architecture for hyper-automation process in industrial internet of things. Clust Comput 27(3):3639–3655. https://doi.org/10.1007/s10586-023-04163-y

Sugan J, Isaac Sajan R (2024) PredictOptiCloud: A hybrid framework for predictive optimization in hybrid workload cloud task scheduling. Simul Model Pract Theory. https://doi.org/10.1016/j.simpat.2024.102946

Sultana Z, Gulmeher R, Sarwath A (2024) Methods for optimizing the assignment of cloud computing resources and the scheduling of related tasks. Indones J Electr Eng Comput Sci 33(2):1092–1099. https://doi.org/10.11591/ijeecs.v33.i2.pp1092-1099

Swarna Priya RM et al (2020) Load balancing of energy cloud using wind driven and firefly algorithms in internet of everything. J Parallel Distrib Comput 142:16–26. https://doi.org/10.1016/j.jpdc.2020.02.010

Swarup S, Shakshuki EM, Yasar A (2021) Task scheduling in cloud using deep reinforcement learning. In: presented at the Procedia Computer Science, pp 42–51 https://doi.org/10.1016/j.procs.2021.03.016

Talaat FM, Ali HA, Saraya MS, Saleh AI (2022) Effective scheduling algorithm for load balancing in fog environment using CNN and MPSO. Knowl Inf Syst 64(3):773–797. https://doi.org/10.1007/s10115-021-01649-2

Tawfeeg TM et al (2022) Cloud dynamic load balancing and reactive fault tolerance techniques: a systematic literature review (SLR). IEEE Access 10:71853–71873. https://doi.org/10.1109/ACCESS.2022.3188645

Teoh YK, Gill SS, Parlikad AK (2023) IoT and Fog-computing-based predictive maintenance model for effective asset management in industry 4.0 using machine learning. IEEE Internet Things J 10(3):2087–2094. https://doi.org/10.1109/JIOT.2021.3050441

Tong Z, Deng X, Chen H, Mei J (2021) DDMTS: a novel dynamic load balancing scheduling scheme under SLA constraints in cloud computing. J Parallel Distrib Comput 149:138–148. https://doi.org/10.1016/j.jpdc.2020.11.007

Tripathy SS et al (2023) State-of-the-art load balancing algorithms for mist-fog-cloud assisted paradigm: a review and future directions. Arch Comput Methods Eng 30(4):2725–2760. https://doi.org/10.1007/s11831-023-09885-1

Ullah A, Chakir A (2022) Improvement for tasks allocation system in VM for cloud datacenter using modified bat algorithm. Multimed Tools Appl 81(20):29443–29457. https://doi.org/10.1007/s11042-022-12904-1

Vasile M-A, Pop F, Tutueanu R-I, Cristea V, Kołodziej J (2015) Resource-aware hybrid scheduling algorithm in heterogeneous distributed computing. Future Gener Comput Syst 51:61–71. https://doi.org/10.1016/j.future.2014.11.019

Velpula P, Pamula R, Jain PK, Shaik A (2022) Heterogeneous load balancing using predictive load summarization. Wirel Pers Commun 125(2):1075–1093. https://doi.org/10.1007/s11277-022-09589-y

Vergara J, Botero J, Fletscher L (2023) A comprehensive survey on resource allocation strategies in fog/cloud environments. Sensors. https://doi.org/10.3390/s23094413

Verma R, Singh PD, Singh KD, Maurya S (2024) Dynamic load balancing in telemedicine using genetic algorithms and fog computing. In: presented at the AIP Conference Proceedings https://doi.org/10.1063/5.0223933

Walia R, Kansal L, Singh M, Kumar KS, Mastan Shareef RM, Talwar S (2023) Optimization of load balancing algorithm in cloud computing. In: presented at the 2023 3rd International Conference on Advance Computing and Innovative Technologies in Engineering, ICACITE 2023, pp 2802–2806 https://doi.org/10.1109/ICACITE57410.2023.10182878

Yong W, Xiaoling T, Qian H, Yuwen K (2016) A dynamic load balancing method of cloud-center based on SDN. China Commun 13(2):130–137. https://doi.org/10.1109/CC.2016.7405731

Zhan ZH, Zhang GY, Gong YJ, Zhang J (2014) Load balance aware genetic algorithm for task scheduling in cloud computing. Lecture Notes in Computer Science (including Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol 8886, pp 644–655. https://doi.org/10.1007/978-3-319-13563-2_54

Zhou X et al (2022) Intelligent small object detection for digital twin in smart manufacturing with industrial cyber-physical systems. IEEE Trans Ind Inform 18(2):1377–1386. https://doi.org/10.1109/TII.2021.3061419


Acknowledgements

We would like to express our gratitude and appreciation to Rabdan Academy Abu Dhabi UAE for their generous support and funding that made this research possible. Their contribution has been invaluable in enabling us to carry out this work to a high standard.

Funding

There is no funding associated with this work.

Author information

Authors and Affiliations

Department of Computer Science & Applications, Maharshi Dayanand University, Rohtak, Haryana, India

Nisha Devi & Sandeep Dalal

Department of CSE, UIET, Maharshi Dayanand University, Rohtak, Haryana, India

Kamna Solanki

Department of Computer Science and Engineering, Amity University Haryana, Gurugram, India

Surjeet Dalal

Department of Computer Science and Engineering, Galgotias University, Greater Noida, UP, India

Umesh Kumar Lilhore & Sarita Simaiya

Department of Spectrum Management, Afghanistan Telecommunication Regulatory Authority, Kabul, 2496300, Afghanistan

Nasratullah Nuristani


Contributions

SD & UKL: Design and methods; KS & SSD: Conclusion and review of the first draft; SS & NN: Introduction and background; SSD & KS: Results and analysis; NM & NN: Discussion and review of the final draft; NN & SD: Conceptualization and corresponding authors. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Surjeet Dalal or Nasratullah Nuristani .

Ethics declarations

Competing interests.

The authors declare no competing interests.

Additional information

Publisher's note.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/ .


About this article

Devi, N., Dalal, S., Solanki, K. et al. A systematic literature review for load balancing and task scheduling techniques in cloud computing. Artif Intell Rev 57 , 276 (2024). https://doi.org/10.1007/s10462-024-10925-w

Download citation

Accepted : 22 August 2024

Published : 05 September 2024

DOI : https://doi.org/10.1007/s10462-024-10925-w


Keywords: Cloud computing, Task scheduling, Load balancing, Machine learning, Optimization techniques
