Posted on 2024-12-01, 00:00. Authored by Federico Ghiglione
Decentralized learning is a type of Machine Learning architecture where data is distributed among
several nodes. With this approach, each node has its own dataset, usually a fraction of the whole
dataset, and it can train a model with its own data without the need to communicate with the other
nodes.
One of the most popular algorithms used in this field is Random Walk Learning (RWL). In RWL, a
single model is sent to a node chosen randomly among all the nodes in the network; the node
trains the model for a certain number of epochs (usually one) and then sends it to another
node chosen randomly among its neighbors, which repeats the same process.
In this way, the model can learn from different nodes during the learning process, but it faces some
challenges. For example, when the data distribution among the nodes is heterogeneous, the final
accuracy can depend on the path taken from the first node to the last one.
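The RWL loop described above can be sketched in a few lines. This is a minimal illustration, not the thesis implementation: the function names, the toy ring topology, the data values, and the scalar "model" trained by `local_train` are all assumptions made for the example.

```python
import random


def local_train(w, data, lr=0.1):
    """Hypothetical stand-in for one local epoch: SGD on the loss 0.5 * (w - x)**2."""
    for x in data:
        w -= lr * (w - x)
    return w


def random_walk_learning(neighbors, datasets, steps, seed=0):
    """Move one model along a random walk, training it at each visited node."""
    rng = random.Random(seed)
    node = rng.choice(sorted(neighbors))  # starting node, chosen among all nodes
    w = 0.0                               # the single shared model (a scalar here)
    path = []
    for _ in range(steps):
        path.append(node)
        w = local_train(w, datasets[node])  # one local epoch on this node's data
        node = rng.choice(neighbors[node])  # hop to a randomly chosen neighbor
    return w, path


# Toy ring of four nodes with heterogeneous local data (illustrative values only).
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
datasets = {0: [0.0, 0.1], 1: [1.0, 1.1], 2: [2.0, 2.1], 3: [3.0, 3.1]}
w, path = random_walk_learning(neighbors, datasets, steps=20, seed=1)
```

Because each node in this toy setup holds data from a different range, running the sketch with different seeds produces different paths and different final values of `w`, which is the path dependence mentioned above.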
A new algorithm called Random Walking Spiders (RWS) has been introduced to solve these issues. The
main difference between RWL and RWS is that RWS trains multiple models simultaneously
instead of just one, as in RWL.
Each model is initially sent to a random node and is trained on the local dataset, as in RWL, for a
certain number of epochs (usually one). After the local training, all the models are
sent to the same node, where their parameters are averaged, so that all the models
become identical. Each model is then sent to a new node chosen randomly, and the
process repeats. The idea behind this new algorithm is to learn from multiple nodes simultaneously
instead of just one at a time, as in RWL.
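One round of RWS (local training, parameter averaging, redistribution) might look like the sketch below. It is illustrative only: the names, the toy data, and the stand-in `local_train` are assumptions, the averaging is computed centrally rather than at a specific rendezvous node, and the next nodes are drawn uniformly from all nodes because the text does not say whether the choice is restricted to neighbors.

```python
import random


def local_train(params, data, lr=0.1):
    """Hypothetical stand-in for one local epoch: SGD on 0.5 * ||params - x||^2."""
    params = list(params)
    for x in data:
        params = [w - lr * (w - xi) for w, xi in zip(params, x)]
    return params


def average_models(models):
    """Element-wise average of the parameter vectors of all models."""
    return [sum(ws) / len(models) for ws in zip(*models)]


def random_walking_spiders(datasets, num_models, rounds, dim, seed=0):
    """Train several models in parallel and average them after every round."""
    rng = random.Random(seed)
    nodes = sorted(datasets)
    models = [[0.0] * dim for _ in range(num_models)]
    positions = [rng.choice(nodes) for _ in range(num_models)]  # random start nodes
    for _ in range(rounds):
        # Each model trains on the local dataset of the node it currently sits on.
        models = [local_train(m, datasets[p]) for m, p in zip(models, positions)]
        # All models are gathered and averaged, so they become identical.
        consensus = average_models(models)
        models = [list(consensus) for _ in range(num_models)]
        # Each model is sent to a new randomly chosen node (assumed uniform).
        positions = [rng.choice(nodes) for _ in range(num_models)]
    return models


# Toy heterogeneous data: each node holds points around a different value.
datasets = {0: [[0.0, 0.0]], 1: [[1.0, 1.0]], 2: [[2.0, 2.0]]}
models = random_walking_spiders(datasets, num_models=3, rounds=5, dim=2)
```

After every round all entries of `models` hold the same parameters, so any one of them can be evaluated as the current model.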
The RWS algorithm can be implemented in two variants: non-adaptive RWS and adaptive
RWS. Non-adaptive RWS is the process described above, where the models are trained on different
nodes, then averaged, and then passed to other nodes. Adaptive RWS is a modification of the
non-adaptive case that starts with a single model, since one model is enough in the first epochs,
and splits it into multiple models once certain conditions are met.
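The adaptive variant can be sketched by starting from one model and replicating it when a split condition fires. The abstract does not state the actual conditions, so the sketch takes them as a caller-supplied predicate, `should_split`, whose name and signature are hypothetical; all other names, the toy data, and the stand-in training step are likewise assumptions.

```python
import random


def local_train(params, data, lr=0.1):
    """Hypothetical stand-in for one local epoch: SGD on 0.5 * ||params - x||^2."""
    params = list(params)
    for x in data:
        params = [w - lr * (w - xi) for w, xi in zip(params, x)]
    return params


def average_models(models):
    """Element-wise average of the parameter vectors of all models."""
    return [sum(ws) / len(models) for ws in zip(*models)]


def adaptive_rws(datasets, max_models, rounds, should_split, dim, seed=0):
    """Start with one model; replicate it to max_models once should_split fires."""
    rng = random.Random(seed)
    nodes = sorted(datasets)
    models = [[0.0] * dim]  # a single model at the beginning
    split_round = None
    for r in range(rounds):
        positions = [rng.choice(nodes) for _ in models]
        models = [local_train(m, datasets[p]) for m, p in zip(models, positions)]
        consensus = average_models(models)
        count = len(models)
        # The real split conditions are not given in the text; the caller decides.
        if split_round is None and should_split(r, consensus):
            split_round = r
            count = max_models
        models = [list(consensus) for _ in range(count)]
    return models, split_round


datasets = {0: [[0.0, 0.0]], 1: [[1.0, 1.0]], 2: [[2.0, 2.0]]}
# Placeholder condition for illustration only: split once round index 2 is reached.
models, split_round = adaptive_rws(
    datasets, max_models=3, rounds=5, should_split=lambda r, params: r >= 2, dim=2
)
```

Passing the condition in as a function keeps the sketch neutral about what actually triggers the split; before the split it behaves like a single random walk, and afterwards like the non-adaptive procedure.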
To test the algorithm, the public CIFAR-10 and Fashion-MNIST datasets were used. The
results show that both non-adaptive RWS and adaptive RWS outperform the RWL algorithm in
terms of accuracy. While non-adaptive RWS is better in the long term, adaptive RWS is also better
in short- and medium-term scenarios. This performance improvement makes the algorithm a significant
advancement in the decentralized learning field.
No human subjects were used to accomplish this task.
Advisor
Erdem Koyuncu
Department
Computer Science
Degree Grantor
University of Illinois Chicago
Degree Level
Masters
Degree Name
Master of Science
Committee Member
Cornelia Caragea
Pedram Rooshenas
Eros Gian Alessandro Pasero (Politecnico di Torino)