I've given a complete set of C preprocessor macro definitions that lets a C program access a hardware UART using programmed input/output.
The input and output subroutines use spin loops (busy-waiting), not interrupts. The receiver spins until the empty flag in the status register clears, which means a character has arrived. Reading the data register sets the empty flag again. The actual hardware device might have a receive FIFO, in which case, instead of the flag going back to empty, the next character from the FIFO would become available straightaway.
The output function works on exactly the same principle, except that it spins while the device is still busy sending any data written previously.
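
To make the polling pattern concrete, here is a minimal sketch of the two routines. It is not the actual macro set: the register order, the flag bit positions and all the names (`uart_base`, `UART_DATA`, `UART_STATUS`, `UART_RX_EMPTY`, `UART_TX_BUSY`, `uart_getc`, `uart_putc`) are assumptions made for illustration, and a small array stands in for the device so the code can be compiled and exercised without hardware.

```c
#include <stdint.h>

/* Hypothetical register layout: one data register and one status register.
 * The order and the bit positions below are assumptions, not a real device. */
static volatile uint8_t uart_sim_regs[2];           /* stand-in for the device    */
static volatile uint8_t *uart_base = uart_sim_regs; /* real code: device address  */

#define UART_DATA      (uart_base[0])  /* read: received byte, write: byte to send */
#define UART_STATUS    (uart_base[1])  /* flag bits, see below                      */
#define UART_RX_EMPTY  0x01u           /* set while no received byte is waiting     */
#define UART_TX_BUSY   0x02u           /* set while the previous byte is being sent */

/* Spin until the receiver is no longer empty, then fetch the byte.
 * On real hardware the read itself makes the device report empty again. */
static inline uint8_t uart_getc(void)
{
    while (UART_STATUS & UART_RX_EMPTY) {
        /* busy-wait */
    }
    return UART_DATA;
}

/* Spin while the transmitter is still busy, then hand over the next byte. */
static inline void uart_putc(uint8_t c)
{
    while (UART_STATUS & UART_TX_BUSY) {
        /* busy-wait */
    }
    UART_DATA = c;
}
```

The `volatile` qualifier matters here: without it the compiler would be free to read the status register once and spin forever on the cached value. On a real target, `uart_base` would point at the device's memory-mapped address instead of the array.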