From my understanding, memory training is/was a closely held secret of memory makers and EDA IP houses who sold memory controller IP to all the chip vendors. This in turn makes fully open motherboard firmware almost impossible as no one can write code for memory training to bring up the chip. That piece of code has to be loaded as a blob - if you can get the blob.
I think you're mixing different concepts. JEDEC doesn't define DDR4 training procedures so there isn't a secret that's being withheld. Everyone who implements a DDR4 controller has to develop and implement a training procedure to meet the specifications.
On a DDR4 motherboard the training would occur between the memory controller and the DDR4 RAM. The proprietary blob you need would include the communication with the memory controller and instructions to handle the training for that specific memory controller.
There are several open source DDR4 controllers in different states of usability. They have each had to develop their own implementations.
It is usually the IP licensing, as spinning a board isn't always complex.
Note, it is actually easier to profile a known set of DRAM chips soldered to the PCB. A lot of products already do this, like phones, tablets, and thin laptops.
Whereas SSDs, being a wear item, should be removable by end users. =3
> as no one can write code for memory training to bring up the chip
Surely someone can do it, but it's probably too niche to be worth it. For a corporation, the licensing fee is probably cheaper than spinning a board and reverse engineering it, and for hobbyists lower-tier memory was likely fine.
That said, given that the technology has become so much more accessible (you can certainly design an FPGA board, wire it up to DDR4 using free tools, and then get the board made in China), it's probably only a matter of time before someone figures this out.
Because DDR3/4/5 dies are made to a price with half to three quarters of their IO pins shared between the dies in parallel on a rank of a channel, and for capacity often up to around 6 ranks per channel. E.g. high capacity server DDR4 memory, say on AMD SP3, may have 108 dies on each of 8 channels of a socket.
So if you can move complexity over to the controller, you can afford about a 100:1 ratio in unit cost there.
So you get to make the memory dies very dumb, e.g. by feeding them a source-synchronous sampling clock that's centered on writes and edge-aligned on reads, leaving the controller to have a DLL master/slave setup to center the clock at each data group of a channel, and retaining only a minimal integer PLL in the dies themselves.
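To make the "controller centers the clock" part concrete, here is a minimal sketch of the usual sweep-and-center idea: try every delay tap, record which ones read back a known pattern correctly, and park the delay in the middle of the widest passing window. Everything here is an assumption for illustration: `read_test`, `fake_lane`, and the 32-tap delay line are hypothetical, and a real controller does this in hardware or firmware per data group, not in Python.

```python
def find_passing_window(results):
    """Return (first, last) index of the longest run of True values, or None."""
    best = None
    start = None
    # A trailing False sentinel closes a run that reaches the last tap.
    for i, ok in enumerate(list(results) + [False]):
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            if best is None or (i - start) > (best[1] - best[0] + 1):
                best = (start, i - 1)
            start = None
    return best


def train_delay(read_test, num_taps):
    """Sweep all delay taps and return the tap at the center of the passing window.

    read_test(tap) is a hypothetical callback that programs the delay tap,
    reads a known pattern, and returns True if it came back intact.
    """
    results = [read_test(tap) for tap in range(num_taps)]
    window = find_passing_window(results)
    if window is None:
        raise RuntimeError("no passing delay tap found")
    return (window[0] + window[1]) // 2


def fake_lane(tap):
    """Simulated data group (assumption): reads only pass for taps 11..22."""
    return 11 <= tap <= 22


center = train_delay(fake_lane, num_taps=32)
print(center)
```

Picking the middle of the window rather than the first passing tap leaves margin on both sides for voltage and temperature drift after training.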
Imprecision in manufacturing (so resistor values need adjusting), different trace lengths (propagation-delay differences between parallel signals), etc. It's in the article.
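For a rough sense of scale on the trace-length point, here is a back-of-the-envelope calculation. The effective dielectric constant of 4.0 and the 10 mm mismatch are assumptions picked for illustration, not figures from the article; DDR4-3200 is used only because its bit time is easy to compute.

```python
import math

C_MM_PER_NS = 299.792458  # speed of light in vacuum, in mm per nanosecond


def skew_ps(length_mismatch_mm, eps_eff=4.0):
    """Arrival-time difference in ps for a given trace length mismatch.

    eps_eff is an assumed effective dielectric constant; real boards vary.
    """
    velocity_mm_per_ns = C_MM_PER_NS / math.sqrt(eps_eff)
    return length_mismatch_mm / velocity_mm_per_ns * 1000.0


def unit_interval_ps(transfers_per_second):
    """Duration of one bit on a data line, in ps."""
    return 1e12 / transfers_per_second


ui = unit_interval_ps(3200e6)  # DDR4-3200: 3200 mega-transfers per second
skew = skew_ps(10.0)           # assumed 10 mm mismatch between two traces
print(f"bit time {ui:.1f} ps, skew {skew:.1f} ps, {skew / ui:.0%} of a bit")
```

Under those assumptions a 10 mm mismatch eats roughly a fifth of a bit time, which is why per-lane delay adjustment is needed at these speeds.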
Because when you change a PCB trace from 0 to 1 or 1 to 0, the slope of the signal as it changes from gnd to v+ (the signal voltage) or v+ to ground isn't perfect, and that slope is highly affected by the various pieces of metal and silicon and fiberglass that make up the board and the chips. The shape and topology of the PCB trace matter, as do slight imperfections in the solder, PCB material, the bond wires inside the chips, etc. These effectively create resistors/capacitors/inductors that the designer didn't intend, which affect the slope of the 0->1 and 1->0 changes. So for these high-speed signals, chip designers started adding parameters to tweak the signal in real time, to compensate for these ill effects. Some parameters include a slight delay between the clock and data signals, to account for skew; voltage adjustment (changing v+), to avoid ringing; adjusting the transistor bias, to catch level transitions more accurately; and termination resistance adjustment, to dampen reflections. And on top of all that, some bits may still be lost, but where the protocol is error-correcting, this is an acceptable loss.
This is how people were able to send Ethernet packets over barbed wire. Many bits are lost, but some get through, and it keeps trying until the checksums all pass.