You did not agree with "C++ concurrency is not there yet" (which is fair enough), but you did not tell me why. Your answers were not C++ related, were they? Am I missing something?
No, these are C++ related. Making use of multiple cores or multiple CPUs comes down to the compiler and the OS. OpenMP, MPI and fork are facilities that have been around for a while and are well understood. There is great support for these in modern C++ compilers on a single machine. If you are looking to scale to multiple nodes, things are a little different, but the OP mentioned multiple cores.
Also, how you achieve parallelism is another point. You can use different processes or different threads. For different processes the solutions are well understood and the OS will take care of orchestrating the work for you. If we are talking about multiple threads, things are fairly easy as long as there isn't any modification of shared resources. Now, if you want to use multiple threads with shared resources and allow them to be modified, you are opening a can of worms: without synchronization you get data races, and things can unravel pretty quickly.
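To make the shared-resource point concrete, here is a minimal sketch using only the standard library (C++11 or later). The function names and the counter are made up for illustration. Several threads increment one shared counter; the first version guards it with a std::mutex, the second uses std::atomic. Dropping the lock in the first version would be a data race, which is undefined behavior and in practice tends to lose updates.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: num_threads threads each add 1 to a shared
// counter increments_per_thread times, with a mutex guarding the counter.
long count_with_mutex(int num_threads, int increments_per_thread) {
    assert(num_threads > 0 && increments_per_thread >= 0);
    long counter = 0;          // shared, mutable state
    std::mutex counter_mutex;  // every access to counter goes through this

    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < increments_per_thread; ++i) {
                std::lock_guard<std::mutex> lock(counter_mutex);
                ++counter;  // safe: only one thread holds the lock at a time
            }
        });
    }
    for (auto& w : workers) w.join();  // wait for all threads before reading
    return counter;
}

// Same idea without a lock: the increment itself is atomic.
long count_with_atomic(int num_threads, int increments_per_thread) {
    assert(num_threads > 0 && increments_per_thread >= 0);
    std::atomic<long> counter{0};

    std::vector<std::thread> workers;
    workers.reserve(num_threads);
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&] {
            for (int i = 0; i < increments_per_thread; ++i) {
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) w.join();
    return counter.load();
}
```

Both functions should return num_threads * increments_per_thread on every run. On some toolchains you may need to build with -pthread.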
I know you mentioned TBB and PPL. These are fairly recent libraries. The facilities I mentioned predate those. Also, I think pthreads have been around since the 90s.
Don't get me wrong. The newer facilities make parallel programming easier, but the problems of parallelization are well understood and solutions have existed for years. This is nothing new.