I'm not quite sure what you mean here, but this version has serious performance losses compared to your last version. I'll try to make sense of why over the weekend.
Sorry, my English isn't good enough to give a clear explanation.
If you run my last "all parallel version", or one of yours, two (or more) times on the same point cloud, you'll see that the results are different each time.
This is due to calling tree.NearestNeighbours(pt, 5) in a parallel execution. As many different pt values as there are cores may be processed at the same time, so the pt returned as Item1 of the tuple may be different from the one used to compute the vector in Item2 of the tuple.
Before improving speed, we have to make sure the method returns the expected results. In other words, start with a solid sequential method, then parallelize whatever can be parallelized without returning wrong results.
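
To illustrate the idea, here is a minimal sketch. It is in Python rather than the code from this thread, and everything in it is an assumption for illustration: the brute-force nearest_neighbours function only stands in for tree.NearestNeighbours, and the vector (offset from the point to the centroid of its neighbours) is just a placeholder computation. The point is that each task receives its own pt, computes both tuple items from that same pt, and the parallel output is checked against a trusted sequential run.

```python
from concurrent.futures import ThreadPoolExecutor
import math


def nearest_neighbours(points, pt, k):
    """Brute-force stand-in for a k-d tree query (hypothetical helper)."""
    others = [p for p in points if p != pt]
    return sorted(others, key=lambda p: math.dist(p, pt))[:k]


def point_and_vector(points, pt, k=5):
    """Return (pt, vector), both derived from the same local pt.

    The vector is a placeholder: offset from pt to the centroid
    of its k nearest neighbours.
    """
    nbrs = nearest_neighbours(points, pt, k)
    centroid = tuple(sum(c) / len(nbrs) for c in zip(*nbrs))
    vector = tuple(c - q for c, q in zip(centroid, pt))
    return pt, vector


def run_sequential(points, k=5):
    """Reference implementation: one point at a time, in order."""
    return [point_and_vector(points, pt, k) for pt in points]


def run_parallel(points, k=5, workers=4):
    """Parallel version: pt is passed as an argument to each task,
    never read from a variable shared between tasks."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input points
        return list(pool.map(lambda pt: point_and_vector(points, pt, k), points))


if __name__ == "__main__":
    cloud = [(float(i), float((i * i) % 7), float((i * 3) % 5)) for i in range(30)]
    reference = run_sequential(cloud)
    # The parallel run is only acceptable if it reproduces the reference.
    print(run_parallel(cloud) == reference)
```

Comparing against the sequential result on every change is a cheap way to catch the kind of mismatch between Item1 and Item2 described above before looking at timings.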