pandas parallelize for loop
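import multiprocessing as mp
import pandas as pd

def func(arg):
    idx, row = arg
    # Edit the row here
    return row

if __name__ == "__main__":
    # data_all is the existing DataFrame to process
    with mp.Pool(processes=mp.cpu_count()) as pool:
        new_rows = pool.map(func, [(idx, row) for idx, row in data_all.iterrows()])
    # Each result is a Series; concatenate them as columns, then transpose back to rows
    data_all_new = pd.concat(new_rows, axis=1).T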
Here is what the above code is doing:
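1. Create a pool of worker processes, one per CPU core.
2. Build a list of (index, row) tuples from data_all.iterrows() to pass to the function.
3. Map the function over that list, so the workers process the rows in parallel.
4. Concatenate the returned rows into a new DataFrame.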
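Sending one row per task means every row is pickled and shipped to a worker on its own, which can cost more than the work itself. A common alternative, not part of the snippet above, is to split the DataFrame into a few chunks and let each worker handle a whole chunk. The sketch below assumes that approach; the names process_chunk and parallel_apply and the columns a, b and total are hypothetical and only there for illustration.

```python
import multiprocessing as mp

import numpy as np
import pandas as pd


def process_chunk(chunk: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical per-chunk edit: add a column holding a + b."""
    chunk = chunk.copy()
    chunk["total"] = chunk["a"] + chunk["b"]
    return chunk


def parallel_apply(df: pd.DataFrame, func, n_workers: int = 2) -> pd.DataFrame:
    """Split df into row blocks, run func on each block, and recombine."""
    if n_workers <= 1 or len(df) == 0:
        return func(df)  # serial fallback, handy for debugging
    # Contiguous blocks of row positions, so the original row order is kept.
    blocks = np.array_split(np.arange(len(df)), n_workers)
    chunks = [df.iloc[block] for block in blocks if len(block) > 0]
    with mp.Pool(processes=n_workers) as pool:
        results = pool.map(func, chunks)  # map returns results in input order
    return pd.concat(results)


if __name__ == "__main__":
    # Toy data standing in for data_all.
    data_all = pd.DataFrame({"a": range(6), "b": range(10, 16)})
    data_all_new = parallel_apply(data_all, process_chunk, n_workers=2)
    print(data_all_new)
```

The function passed to the pool has to be defined at module level so it can be pickled, and the pool should be created under the __main__ guard so the script also works on platforms that start workers by spawning a fresh interpreter.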