Multi-CPU bzip2 using a Python script
I want to quickly compress several hundred gigabytes of data with bzip2 on my 8-core, 16 GB RAM workstation. Currently I am using a simple Python script that compresses a complete directory tree by calling bzip2 through os.system.
I see that bzip2 only uses one CPU while the other CPUs stay relatively idle.
I am new to queues and threaded processes, but I am wondering how I can implement this so that four bzip2 threads run at once (actually, I guess, os.system threads), each probably using its own CPU, and each taking files from a queue as it compresses them.
My single-threaded script is pasted here.
Use the subprocess module to spawn several processes at once. Once N of them are running (N should be slightly larger than the number of CPUs, say 3 for 2 cores, or a little more than 8 for 8 cores), wait for one to finish and then start another.
Note that this might not help much, because there will be a lot of disk activity, which you cannot parallelize. Plenty of free RAM for caching does help.
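The approach above could be sketched roughly as follows. This is only a minimal sketch, not the asker's original script: the function names (iter_files, run_one, compress_tree), the default of "number of cores plus two" workers, and the choice of a thread pool to cap the number of live bzip2 processes are all assumptions made for illustration. Each worker thread just blocks on one external process, so at most max_procs compressors run at a time, and a new one starts as soon as one finishes.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def iter_files(root):
    """Yield every file path under root, skipping files that already end in .bz2."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".bz2"):
                yield os.path.join(dirpath, name)


def run_one(command, path):
    """Run one external compressor process on path; return (path, exit code)."""
    result = subprocess.run(list(command) + [path])
    return path, result.returncode


def compress_tree(root, max_procs=None, command=("bzip2",)):
    """Compress all files under root with at most max_procs processes at once.

    Returns the list of paths whose process exited with a non-zero status.
    """
    if max_procs is None:
        # Assumption: slightly more workers than cores, as the answer suggests.
        max_procs = (os.cpu_count() or 1) + 2
    # Collect the paths first so files created during compression are not picked up.
    paths = list(iter_files(root))
    with ThreadPoolExecutor(max_workers=max_procs) as pool:
        # Each thread waits on one child process, so the pool size caps
        # the number of compressors running at the same time.
        results = list(pool.map(lambda p: run_one(command, p), paths))
    return [path for path, code in results if code != 0]
```

It might be called as failed = compress_tree("/path/to/data", max_procs=10), where the path is a placeholder. Because the disk is likely to be the bottleneck, it is worth trying a few values of max_procs rather than assuming more is better.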