However, I just realized that the version I posted above may be too slow when the segmentation is large, so I decided to post a faster version as well, but it produces duplicate faces (there is no way to get both fast and correct). With this one you have to overkill the duplicate faces yourself if you want them removed, which I strongly recommend you don't do: you'll just lose time. Then again, I haven't checked; maybe this method is still faster than using the variant I posted above...
Regards, M.R. (arch.)