TY - GEN
T1 - CUDA-based Acceleration of full waveform inversion on GPU
AU - Wang, Baoli
AU - Gao, Jinghuai
AU - Zhang, Huanlan
AU - Zhao, Wei
N1 - Publisher Copyright: © 2011 SEG.
PY - 2011
Y1 - 2011
N2 - Computational cost and storage requirements are the main obstacles to the research and practical application of full waveform inversion (FWI). We have developed a fast parallel scheme to speed up FWI on a graphics processing unit (GPU), a parallel computing device, using CUDA (Compute Unified Device Architecture), the programming environment developed by NVIDIA. In this scheme, to avoid frequent, low-bandwidth data transfers between host memory and device memory, almost the entire computing task, including the propagator and backpropagator, is coded as a sequence of kernel functions that are called from the host in each inversion iteration. Random boundary conditions are used when propagating the source wavefield to address the storage requirement, so that no additional wavefield data need to be saved; the noise they introduce into the final inversion image becomes weak enough over the iterations to be ignored. To test our algorithm, we implemented FWI on a personal computer (PC) with a GTX480 GPU and reconstructed the Marmousi velocity model from synthetic data generated by a finite-difference time-domain code. This numerical test indicates that the GPU-based FWI is typically 80 times faster than the CPU-based implementation.
AB - Computational cost and storage requirements are the main obstacles to the research and practical application of full waveform inversion (FWI). We have developed a fast parallel scheme to speed up FWI on a graphics processing unit (GPU), a parallel computing device, using CUDA (Compute Unified Device Architecture), the programming environment developed by NVIDIA. In this scheme, to avoid frequent, low-bandwidth data transfers between host memory and device memory, almost the entire computing task, including the propagator and backpropagator, is coded as a sequence of kernel functions that are called from the host in each inversion iteration. Random boundary conditions are used when propagating the source wavefield to address the storage requirement, so that no additional wavefield data need to be saved; the noise they introduce into the final inversion image becomes weak enough over the iterations to be ignored. To test our algorithm, we implemented FWI on a personal computer (PC) with a GTX480 GPU and reconstructed the Marmousi velocity model from synthetic data generated by a finite-difference time-domain code. This numerical test indicates that the GPU-based FWI is typically 80 times faster than the CPU-based implementation.
UR - https://www.scopus.com/pages/publications/85055496526
M3 - Conference contribution
AN - SCOPUS:85055496526
SN - 9781618391841
T3 - Society of Exploration Geophysicists International Exposition and 81st Annual Meeting 2011, SEG 2011
SP - 2528
EP - 2533
BT - Society of Exploration Geophysicists International Exposition and 81st Annual Meeting 2011, SEG 2011
PB - Society of Exploration Geophysicists
T2 - Society of Exploration Geophysicists International Exposition and 81st Annual Meeting 2011, SEG 2011
Y2 - 18 September 2011 through 23 September 2011
ER -