Thinking Allowed

medical / technology / education / art / flub

Constructing Transformers For Longer Sequences with Sparse Attention Methods

"We show that carefully designed sparse attention can be as expressive and flexible as the original full attention model. Along with theoretical guarantees, we provide a very efficient implementation which allows us to scale to much longer inputs. As a consequence, we achieve state-of-the-art results for question answering, document summarization and genome fragment classification."

Source: ai.googleblog.com
