A cynic would say that DDR5 was really created because the density of RAM is getting… well… denser. Yes, DDR4 allows for densities of up to 16Gbit (2GB) per IC. Yes, when DDR4 came about that was a massive amount on a single IC. Those days are in the past. These days modern node sizes are so much smaller that this limitation means either wasting space on the wafer or making teeny-tiny ICs. Both solutions cost manufacturers money and reduce the profitonium included with every DDR4 RAM IC sale.
This is, arguably, why DDR5 allows for up to 64Gbit (8GB) ICs. Since a typical single sided DIMM has room for 8 ICs, and a dual sided one has room for 16 ICs, this means, in theory, we could see a single stick of DDR5 with 128GB of capacity instead of the maximum of 32GB with DDR4. Yes, for the average user who uses at most two 16GB sticks this is a nothing burger… as they will still want/need to populate two DIMMs. It is only in the enterprise market that this is a major improvement (think of a single server with only 8 DIMMs populated having 1TB of RAM). People will point to the fact that the number of pins has not changed, the form-factor really hasn’t changed (beyond a different key layout), and say it was ‘only’ this increased density that explains why DDR5 came about.
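For anyone who wants to check that capacity math, here is a minimal sketch. The function name `dimm_capacity_gb` is made up for illustration, and the numbers are just the ones quoted above (16Gbit vs. 64Gbit ICs, 16 ICs on a dual sided DIMM, 8 DIMMs in the server example).

```python
def dimm_capacity_gb(ic_density_gbit: int, ic_count: int) -> int:
    """Capacity in GB of a DIMM built from ic_count ICs of the given density.

    Hypothetical helper for illustration: Gbit per IC times number of ICs,
    divided by 8 bits per byte.
    """
    return ic_density_gbit * ic_count // 8


# Dual sided DIMM with 16 ICs, using the maximum IC density quoted above.
ddr4_max_gb = dimm_capacity_gb(ic_density_gbit=16, ic_count=16)
ddr5_max_gb = dimm_capacity_gb(ic_density_gbit=64, ic_count=16)

# The enterprise example: a server with 8 such DDR5 DIMMs populated.
server_tb = 8 * ddr5_max_gb / 1024

print(f"DDR4 max per DIMM: {ddr4_max_gb} GB")
print(f"DDR5 max per DIMM: {ddr5_max_gb} GB")
print(f"8 x DDR5 DIMMs:    {server_tb} TB")
```

Same number of ICs, four times the density per IC, four times the capacity per stick.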
We are cynical. We are jaded. We are not that cynical or that jaded… yet. The DDR4 vs. DDR5 differences boil down to more than ‘density’, and are a bit more nuanced. Yes, density needs to increase if we are to keep the amount of RAM per core at reasonable levels in the face of 16/24 (and even more) mega-core-count CPUs going mainstream. There are, however, major architectural differences that will pay dividends in the future, and some even pay dividends right now.
For example, while the bus width per channel has not changed (much)… it is different. DDR4 uses a single 64-bit wide bus for each channel (i.e. with a ‘dual’ channel Integrated Memory Controller the bus is 128 bits wide, with ‘quad’ it is a 256-bit wide memory bus, etc.) and offers a burst size of 64 bytes per channel (i.e. a burst length of 8, aka eight 64-bit words of data in one ‘burst’, aka 64 data bits x 8). DDR5 uses two INDEPENDENT 32-bit sub-channels for the bus (with the ICs on the DIMM split between the two sub-channels) and offers a burst that is also capable of squirting 64 bytes (i.e. 32 bits multiplied by a burst length of 16). Per sub-channel. 64 bytes is a big deal as that is the typical CPU cache line size… meaning in a typical dual channel system DDR5 allows four full cache lines to be independently filled/emptied in a single go instead of just two. In terms of independent cache line transfers, that turns a dual channel setup into the equivalent of what a last gen quad channel system could do.
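The burst arithmetic is easy to verify. The sketch below is only an illustration using the figures from the paragraph above; the helper names (`burst_bytes`, `independent_cache_lines`) are hypothetical, and the 64-byte cache line is the ‘typical’ size assumed in the text, not a universal constant.

```python
CACHE_LINE_BYTES = 64  # assumption: the typical CPU cache line size cited above


def burst_bytes(bus_width_bits: int, burst_length: int) -> int:
    """Bytes moved in one burst: bus width times burst length, in bytes."""
    return bus_width_bits * burst_length // 8


def independent_cache_lines(channels: int, sub_channels_per_channel: int) -> int:
    """How many cache lines can be transferred independently at once.

    Each independent (sub-)channel is assumed to deliver exactly one
    cache line per burst, as in the paragraph above.
    """
    return channels * sub_channels_per_channel


ddr4_burst = burst_bytes(bus_width_bits=64, burst_length=8)    # one 64-bit channel
ddr5_burst = burst_bytes(bus_width_bits=32, burst_length=16)   # one 32-bit sub-channel

ddr4_dual = independent_cache_lines(channels=2, sub_channels_per_channel=1)
ddr4_quad = independent_cache_lines(channels=4, sub_channels_per_channel=1)
ddr5_dual = independent_cache_lines(channels=2, sub_channels_per_channel=2)

print(f"DDR4 burst per channel:     {ddr4_burst} bytes")
print(f"DDR5 burst per sub-channel: {ddr5_burst} bytes")
print(f"Independent cache lines: DDR4 dual={ddr4_dual}, "
      f"DDR4 quad={ddr4_quad}, DDR5 dual={ddr5_dual}")
```

Halving the width while doubling the burst length keeps each burst at one full cache line, which is what lets each sub-channel work on its own line independently.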