pandas parallelize for loop

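import multiprocessing as mp
import pandas as pd

def func(arg):
    idx, row = arg
    # Edit the row here (it is a copy, so data_all itself is not modified)
    return row

if __name__ == "__main__":
    # data_all is the existing DataFrame to process
    # Create the pool after func is defined so the workers can find it
    with mp.Pool(processes=mp.cpu_count()) as pool:
        new_rows = pool.map(func, list(data_all.iterrows()))
    # Each result is a Series, so rebuild the frame row by row;
    # pd.concat(new_rows) would stack them into one long Series
    data_all_new = pd.DataFrame(new_rows)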

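Here is what the above code is doing:
1. Create a pool of worker processes, one per CPU core.
2. Build a list of (index, row) arguments from data_all.iterrows().
3. Map the function over the list of arguments, so each worker handles a share of the rows.
4. Combine the returned rows into a new DataFrame.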
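
The four steps can be tried end to end with a small sketch like the one below. The toy DataFrame, the column names a, b and total, and the helper names add_total and parallel_rows are made up for illustration and are not part of the original snippet. The rows are rebuilt with pd.DataFrame, on the assumption that one output row per input row is wanted.

```python
import multiprocessing as mp

import pandas as pd


def add_total(arg):
    """Hypothetical worker: takes one (index, row) pair, returns the edited row."""
    idx, row = arg
    row = row.copy()                    # work on a copy of the row
    row["total"] = row["a"] + row["b"]  # example edit: add a derived value
    return row


def parallel_rows(df, worker, processes=2):
    """Apply `worker` to every row of `df` using a pool of processes."""
    args = list(df.iterrows())                   # step 2: (index, row) pairs
    with mp.Pool(processes=processes) as pool:   # step 1: pool of workers
        new_rows = pool.map(worker, args)        # step 3: map, order is kept
    return pd.DataFrame(new_rows)                # step 4: rows -> DataFrame


if __name__ == "__main__":
    # Toy data, assumed only for this example
    data_all = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})
    data_all_new = parallel_rows(data_all, add_total)
    print(data_all_new)
```

Each row is pickled and sent to a worker, so this pattern tends to pay off only when the per-row work is expensive. For cheap edits, a vectorized pandas operation is usually faster than any pool.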
