(Please note that this entire question, although it addresses parallel programming, is largely framed in the context of Python 3.x.)
At the moment, what I gather from reading is that:
A process is a set of instructions, along with all the resources that accompany it while it is running. It would include the following code, as well as the input, output, memory, file handles, and so on. In other words, it's the whole kitchen sink.
# this script, while running as a whole, is considered a process
print('hello world')
with open('something.txt', 'a') as file_handle:
    for i in range(500):
        file_handle.write('blablabla')
print('job done!')
However, if I wanted to do more in the same amount of time, in order to make full use of my computer's processing power, I have the option to spawn more processes or threads. Which one do I choose? And in terms of the simple Python script analogy above, what would each of them be? Is spawning another process the equivalent of just running the entire thing again with a changed filename?
# changed filename (is this "another process?")
print('hello world')
with open('something_else.txt', 'a') as file_handle:
    for i in range(500):
        file_handle.write('blablabla')
print('job done!')
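To make the question more concrete, here is my rough guess at what "another process" might look like when it is started from inside one script, rather than by running two copies of the script by hand. This is only a sketch: `append_lines` is a name I made up, the temporary directory is just there so the example does not litter my working folder, and I am assuming the standard library's `multiprocessing.Process` is the relevant tool.

```python
import multiprocessing
import os
import tempfile


def append_lines(filename, count=500):
    """Hypothetical helper: the same file-writing work as the script above."""
    with open(filename, 'a') as file_handle:
        for i in range(count):
            file_handle.write('blablabla')
    # Each worker reports the ID of the process it is running in.
    print('wrote', filename, 'in process', os.getpid())


if __name__ == '__main__':
    out_dir = tempfile.mkdtemp()
    filenames = [os.path.join(out_dir, name)
                 for name in ('something.txt', 'something_else.txt')]

    # One Process object per file; each one runs append_lines separately.
    workers = [multiprocessing.Process(target=append_lines, args=(name,))
               for name in filenames]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()  # wait for every child to finish

    print('job done! parent process was', os.getpid())
```

If I understand correctly, the process IDs printed by the workers should differ from the parent's, which is part of what I am asking about.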
I also get the vague idea that a single process can contain multiple threads. Would that just be the equivalent of running a bunch of additional "conceptual" for loops?
# like, would this be a "thread", a barebones "subset" of an entire program?
with open('something.txt', 'a') as file_handle:
    for i in range(500):
        file_handle.write('blablabla')
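For comparison, this is my tentative guess at what the "thread" version could look like, assuming the standard library's `threading.Thread` is what people mean. The names `append_lines` and `run_in_threads` are hypothetical, and I am not claiming this is the right way to do it.

```python
import os
import tempfile
import threading


def append_lines(filename, count=500):
    """Hypothetical helper: the loop from the snippet above, as a function."""
    with open(filename, 'a') as file_handle:
        for i in range(count):
            file_handle.write('blablabla')
    # Every thread reports the same process ID but a different thread name.
    print('wrote', filename, 'in process', os.getpid(),
          'on', threading.current_thread().name)


def run_in_threads(filenames, count=500):
    """Start one thread per file and wait for all of them to finish."""
    threads = [threading.Thread(target=append_lines, args=(name, count))
               for name in filenames]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


if __name__ == '__main__':
    out_dir = tempfile.mkdtemp()
    run_in_threads([os.path.join(out_dir, 'something.txt'),
                    os.path.join(out_dir, 'something_else.txt')])
    print('job done!')
```

Here each "thread" is a function call running inside the one process, not a separate copy of the script, if I am reading things right.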
How are the two really different from one another anyway? Searching online, I get the idea that processes are more independent and heavyweight, while threads are more lightweight and "easier to share memory with each other." But what does this really mean? Why can't processes share memory with each other too? And if threads can "share memory," how come I can't access variables in one thread from another thread spawned by the same script (e.g. something like from thread_a import var_data)?
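To pin down what I mean by "sharing memory", here is a small experiment I would expect to illustrate the difference. It is a sketch under my own assumptions: `add_item`, `run_thread`, and `run_process` are made-up names, and `var_data` is just an ordinary module-level list.

```python
import multiprocessing
import threading

var_data = []  # ordinary module-level list in the main script


def add_item(label):
    """Append to the module-level list from whatever thread/process runs this."""
    var_data.append(label)


def run_thread():
    thread = threading.Thread(target=add_item, args=('from thread',))
    thread.start()
    thread.join()


def run_process():
    process = multiprocessing.Process(target=add_item, args=('from process',))
    process.start()
    process.join()


if __name__ == '__main__':
    run_thread()
    # The thread ran inside this process, so its append shows up here.
    print('after thread: ', var_data)

    run_process()
    # The child process appended to its own copy of var_data,
    # so the parent's list is unchanged by it.
    print('after process:', var_data)
```

As far as I can tell, the thread's change is visible to the main script while the child process's change is not, and I would like to understand why.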
Lastly, what computes what, exactly? Does a CPU compute threads or processes? Or is "CPU" an overarching term encompassing multiple cores, etc.? Do cores compute processes or threads?
Summary:
1) Using a simple Python script as an example of a process, what would the equivalent of spawning another process/thread be (e.g. a duplicate of the script, a subset of the script, or only some section of code)?
2) How are processes fundamentally different from threads? What is an example of something processes can do that threads cannot?
3) Why is memory/data often described as "harder to share" between processes than between threads? And how do threads share data anyway?
4) Do CPUs compute threads or processes? Do cores compute threads or processes?
5) What are some general guidelines/examples of when to use which?