Isn't the keyword here "overdubbing"?
I tried it once with Audacity, using it as both the recording and the playback software. It was super easy to do, and there is no need for syncing in post: it is all done automatically while overdubbing.
Ed Sheeran does this even in live performances on stage; he developed* the X-looper pedal for this, a multitrack overdubbing/looper foot pedal. See him using it here in a studio session, or even live. You will notice a small black box on his chest with a cable to his left ear: a wireless, latency-free earphone for his own playback. (That avoids the bounce Peyton is talking about, while still looking good on camera, unlike with fat cans on his head.)
If you are not live, there is no need for extra hardware. Just use a computer with a mic and headphones (latency-free, so avoid Bluetooth) and any software that allows overdubbing. Or use the built-in functionality of a recorder like the Zoom H5. But with those tiny buttons and the small display, it looks way more complicated than Audacity.
The only caveat is that you have to sync the video recording(s) to the final audio track. But if you are only one or maybe two frames off, it is still acceptable. A perfectly synced audio track is way more important.
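To get a feel for what "one or two frames off" means in time, here is a minimal sketch that converts a frame offset into milliseconds. The function name `frame_offset_ms` and the list of frame rates are just my assumptions for illustration; use whatever frame rate your camera actually records at.

```python
def frame_offset_ms(frames: int, fps: float) -> float:
    """Return the duration of `frames` video frames in milliseconds."""
    return frames * 1000.0 / fps


# Assumed, commonly used frame rates; replace with your camera's setting.
for fps in (24, 25, 30, 60):
    one = frame_offset_ms(1, fps)
    two = frame_offset_ms(2, fps)
    print(f"{fps} fps: 1 frame = {one:.1f} ms, 2 frames = {two:.1f} ms")
```

At 25 fps, for example, one frame lasts 40 ms, so a two-frame error means the picture is 80 ms away from the sound.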
* Actually, a renowned audio company developed the "Sheeran X Looper", but Ed Sheeran set the specification for his needs and was deeply involved during development as a tester who gave feedback. And now his name is used for marketing purposes.