Why do we use references at those voltages instead of simply 2 V and 4 V?
This can be advantageous in just the right circumstances when the microcontroller is displaying values directly to a human. However, most of the time it's because there are lots of people out there that are bad at math or don't stop and actually think.
As others have already shown, 2.048 = 2^11/1000 and 4.096 = 2^12/1000. If you use a 12-bit A/D with a 4.096 V reference, each count is 1 mV.
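To make the arithmetic concrete, here is a minimal sketch of the count-to-millivolt conversion for a 12-bit A/D. The function name `counts_to_mv` and its parameters are made up for illustration. With a 4.096 V reference the multiply and shift cancel, so the raw count already is the reading in millivolts.

```c
#include <stdint.h>

/* Hypothetical helper: convert a raw 12-bit A/D count to millivolts,
   given the reference voltage in millivolts. */
static uint32_t counts_to_mv(uint16_t counts, uint32_t vref_mv)
{
    /* One count is vref / 2^12. With vref_mv = 4096 this is exactly 1 mV,
       so the function returns its input unchanged. */
    return ((uint32_t)counts * vref_mv) >> 12;
}
```

With any other reference, for example 5.000 V, the same expression does the scaling at the cost of one multiply and one shift.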
However, stop and consider when that actually matters. There is nothing inherently special about units of millivolts. In terms of physics, they are a totally arbitrary unit for measuring EMF.
In a control system, for example, the units used for the various measured quantities can be anything you like, as long as you know what they are. If you are using fixed point, then you want the maximum value to nearly fill the number, and use enough bits so that you have the necessary resolution. The scaling of units should be dictated by convenient internal binary representations.
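As one possible illustration of letting the binary representation pick the units, the sketch below left-justifies a 12-bit reading into an unsigned 16-bit word, so the value is a fraction of full scale rather than a count of millivolts. The name `adc_to_frac16` is hypothetical, and this is only one convention among many.

```c
#include <stdint.h>

/* Hypothetical helper: left-justify a 12-bit A/D count into 16 bits.
   The result is a fraction of the reference voltage, where 0x8000
   means half of full scale regardless of what the reference is. */
static uint16_t adc_to_frac16(uint16_t counts)
{
    return (uint16_t)((counts & 0x0FFFu) << 4);
}
```

Full scale nearly fills the word, and nothing in the code depends on the reference being a round number of millivolts.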
There will inevitably be adjustable gain factors later in the process anyway. Custom scaling of the input values can be absorbed into those gain factors, since the system already has to handle arbitrary values for them. No additional computation is required, only different values fed into the same computations.
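A rough sketch of folding the input scaling into an existing gain follows. Everything here is assumed for illustration: a single proportional stage, a Q16.16 fixed-point gain, a gain value of 2.0 output units per millivolt, and a 12-bit A/D on a 5.000 V reference.

```c
#include <stdint.h>

/* Hypothetical proportional stage: gain is a Q16.16 fixed-point number.
   Assumes right shift of a negative value is arithmetic, as it is on
   common compilers. */
static int32_t apply_gain(int32_t error, int32_t gain_q16)
{
    return (int32_t)(((int64_t)error * gain_q16) >> 16);
}

/* Gain as tuned with the error expressed in millivolts (assumed value). */
#define GAIN_PER_MV_Q16     (2 * 65536)

/* Same gain with the error left in raw counts. One count is 5000/4096 mV,
   so the gain is scaled by that ratio once, when the constant is built. */
#define GAIN_PER_COUNT_Q16  ((int32_t)(((int64_t)GAIN_PER_MV_Q16 * 5000) / 4096))
```

The run-time code is the same `apply_gain` call either way. Only the constant differs, so working in raw counts costs nothing extra.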
In some cases, these small embedded systems need to display digital values to humans. In that case, units of millivolts are useful when you want to show a voltage with three decimal places. However, human interfaces by their nature are slow compared to microcontrollers. Generally you don't want to update a digital display at more than 2 Hz. Converting a number to decimal digits already requires some arithmetic anyway. Scaling some internal value to match the displayed resolution is a rather minor additional step relative to that process.
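For a sense of how small the extra step is, here is a sketch that turns a raw 12-bit count into a string with three decimal places for any reference voltage. The function name `format_volts` and the buffer handling are assumptions made for this example.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical display helper: format a 12-bit A/D count as volts with
   three decimal places, for a reference given in millivolts. */
static void format_volts(uint16_t counts, uint32_t vref_mv,
                         char *buf, size_t len)
{
    /* Scale to millivolts, rounding to the nearest one. */
    uint32_t mv = ((uint32_t)counts * vref_mv + 2048u) >> 12;

    /* The decimal conversion is needed whatever the reference is. */
    snprintf(buf, len, "%lu.%03lu",
             (unsigned long)(mv / 1000u), (unsigned long)(mv % 1000u));
}
```

The scaling is a single line next to the divide and modulo that the decimal conversion needs anyway, and at a display update rate of a couple of hertz its cost is negligible.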
Then also consider how often you actually want to measure a voltage in the range of 0 to 4.095 V, or at least most of that range. If you want to measure 0 to 5 V, then the 4.096 V reference really doesn't help. You need to attenuate the signal into the A/D anyway, so reading the attenuated signal in units of millivolts confers no special advantage, even when displaying digital values.
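As a sketch of the 0 to 5 V case, assume a resistor divider that scales the input by 4/5, so that 5 V at the input becomes 4.000 V at the A/D pin. The divider ratio and the name `input_mv_from_counts` are assumptions for illustration only.

```c
#include <stdint.h>

/* Hypothetical: 12-bit A/D with a 4.096 V reference behind a 4/5 divider.
   Each count is 1 mV at the pin, but that is not the input voltage. */
static uint32_t input_mv_from_counts(uint16_t counts)
{
    /* Undo the divider: input = pin * 5/4. */
    return ((uint32_t)counts * 5u) / 4u;
}
```

A scaling step is back in the code despite the reference, so the reference value bought nothing in this case.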
So in short, in today's world with microcontrollers handling A/D readings, 2.048 and 4.096 V references mostly cater to a perceived need, and to knee-jerkers who don't think about the problem properly.