
What is the primary advantage of using self-attention over recurrent neural networks for capturing long-range dependencies?



The primary advantage of self-attention over recurrent neural networks (RNNs) in capturing long-range dependencies is its ability to directly access and weigh the importance of *any* word in the input sequence when processing a given word. RNNs, like LSTMs (Long Short-Term Memory) and GRUs (Gated Recurrent Units), process the input sequentially: they read the input one word at a time, maintaining a hidden state that summarizes the information seen so far...
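The contrast can be made concrete with a small sketch. The NumPy code below is an illustration only, assuming single-head scaled dot-product attention and a plain tanh RNN (not an LSTM or GRU); the function names, toy dimensions, and random weights are all hypothetical. In the attention function, every position gets a weight over every other position in a single matrix product, whereas in the RNN loop, information from the first word reaches the last only by passing through every intermediate hidden state.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    # X has shape (seq_len, d). Each row is one word's embedding.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # scores[i, j] compares word i with word j directly, however far apart.
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    weights = softmax(scores, axis=-1)
    return weights @ V, weights

def rnn_hidden_states(X, Wx, Wh):
    # A plain tanh RNN: the hidden state is updated one word at a time,
    # so word 0 influences word t only through t successive updates.
    h = np.zeros(Wh.shape[0])
    states = []
    for x_t in X:
        h = np.tanh(x_t @ Wx + h @ Wh)
        states.append(h)
    return np.stack(states)

# Toy input: 6 "words", each a 4-dimensional embedding (assumed sizes).
rng = np.random.default_rng(0)
seq_len, d = 6, 4
X = rng.normal(size=(seq_len, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

out, weights = self_attention(X, Wq, Wk, Wv)
states = rnn_hidden_states(X, rng.normal(size=(d, d)), rng.normal(size=(d, d)))
```

Here `weights[5, 0]` is the weight the last word assigns to the first word, computed in one step. The RNN has no equivalent direct connection: `states[5]` depends on the first word only through the chain `states[0]`, `states[1]`, and so on.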



