Raw images are essentially all of the light data captured by the image sensor at the moment of capture, dumped into a file. JPEGs are compressed versions of those images that the camera has processed automatically based on a color profile. So raw images contain far more information than JPEGs do; in making the JPEG, a lot of that information is selectively thrown out.
For instance, if you shoot a scene with overexposed whites or underexposed blacks, where large portions of the image contain very similar dark or light tones with only subtle differences between them, you'll notice that as you start to lift the shadows and push those tones around in color correction, the gradation between them may break apart and result in banding. This is because there isn't enough color depth there.
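To see this in numbers, here's a minimal sketch (in Python with NumPy, purely illustrative) that stores a subtle shadow gradient in 8-bit and then lifts it in post:

```python
import numpy as np

# A smooth dark gradient: linear values from 0.00 to 0.05 (deep shadows),
# one sample per pixel across a 1920-pixel-wide frame.
smooth = np.linspace(0.00, 0.05, 1920)

# Quantize to 8-bit (256 levels), the way a JPEG stores it.
stored_8bit = np.round(smooth * 255) / 255

# "Lift the shadows": multiply the stored values by 4 in post.
lifted = np.clip(stored_8bit * 4, 0, 1)

# Only 14 distinct values survive across all 1920 pixels, so instead of a
# smooth ramp we get 14 visible bands.
print(len(np.unique(lifted)))
```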
Let's break down why that is. Remember, a bit is just a binary 1 or 0, so there are only 2 values it can hold. 8-bit color makes sense because there are 8 bits to a byte, and if we break down the possible combinations we get 2 * 2 * 2 * 2 * 2 * 2 * 2 * 2, or 2 to the power of 8, which is 256. That means there are 256 possible tonal values each for R, G, and B, which combined together make white and in the absence of all 3 make black, per additive color theory (the kind of color mixing we're dealing with when we mix colored light, versus subtractive color theory, the kind of mixing that happens when we mix paint).
All things said and done, 256 * 256 * 256 is still 16,777,216 possible colors to choose from, but that depth is spread across every possibility along the color spectrum rather than all going to tonal gradation. To the untrained layman, 8-bit color might look indistinguishable from 10- or 12-bit color, but a colorist or video editor can pick out the subtle differences. That subtle color information is not the main reason 10-bit color is so sought after by video professionals nowadays, though it does tend to give images that magical "something special." The reason 10-bit video (most often shot in a log profile, which I'll explain in a moment) is so sought after is that it has far more flexibility for color correction, as those subtle color differences get exaggerated more and more the further we stretch the values.
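If you want to check that arithmetic yourself, a quick sketch:

```python
# Tonal values per channel and total colors at common bit depths.
for bits in (8, 10, 12):
    per_channel = 2 ** bits      # 2 multiplied by itself `bits` times
    total = per_channel ** 3     # every R * G * B combination
    print(f"{bits}-bit: {per_channel:,} values per channel, {total:,} colors")

# 8-bit:  256 values per channel,   16,777,216 colors
# 10-bit: 1,024 values per channel, 1,073,741,824 colors
# 12-bit: 4,096 values per channel, 68,719,476,736 colors
```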
Back to raw images. What bit depth are they? This actually depends on the camera manufacturer and the sensor. Canon's CR2 raw files support up to, and are read back in, 16-bit color. 16-bit means 2 to the power of 16, or 65,536 possibilities each for red, green, and blue. I'll let you multiply that out three ways and see if the number even fits on your calculator. Now, whether the sensor itself can actually capture that much color depth is doubtful. But the camera will throw all the information it does capture into that container.
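And if your calculator does give up, Python won't:

```python
per_channel = 2 ** 16        # 65,536 values each for red, green, and blue
total = per_channel ** 3     # 65,536 * 65,536 * 65,536
print(f"{total:,}")          # 281,474,976,710,656 possible colors
```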
So the bit depth of the container and the amount of actual information we capture are not the same thing, and we need a way of measuring that information. One of those ways is dynamic range. Dynamic range describes how many "stops" of light we are able to capture at once from a scene before the detail falls apart. The difference between one stop of light and the next is exactly double, so a measure of 12 stops of dynamic range essentially means we can take the lowest value in the scene and double its brightness 12 times before the values fall apart. Around 12 stops is pretty standard for most DSLRs and is still considered standard dynamic range. There is some debate about where high dynamic range begins, with most agreeing it starts anywhere above 13 stops, and it definitively exists above 14 stops. The Blackmagic Pocket 4K, for instance, a popular cinema camera for shooting RAW video, is rated at 13 stops of dynamic range.
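Because each stop is a doubling, a dynamic range figure translates directly into a contrast ratio between the darkest and brightest usable values. A quick sketch:

```python
# N stops of dynamic range means the brightest usable value is 2**N times
# the darkest usable value.
for stops in (12, 13, 14):
    print(f"{stops} stops = {2 ** stops:,}:1 contrast ratio")

# 12 stops = 4,096:1
# 13 stops = 8,192:1
# 14 stops = 16,384:1
```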
One important thing to note is that even if a camera is capable of 12 or even 14 stops of dynamic range, it has to squash that dynamic range to fit it inside the JPEG container, and the algorithm that does this is automatic and thus not as sophisticated as a human editor would be. This is especially the case with the kinds of HDR photos new phones take, which need a moment to process because the phone is literally squeezing and compressing that high dynamic range into a smaller container. The phone is internally doing a bit of what we call tone mapping: mapping something of high dynamic range onto a container of lower dynamic range, which we can also do manually in Photoshop. Phones are getting better and better at using AI image enhancement to make these tone mapping choices automatically instead of needing a skilled colorist, though more aggressive tone mapping can still stick out like a sore thumb as fake or overly processed.
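To make tone mapping concrete, here's a minimal sketch using the classic Reinhard global operator, which is just one deliberately simple curve and certainly not what any particular phone actually runs:

```python
import numpy as np

def reinhard(linear_hdr):
    """Classic Reinhard global tone mapping: squeezes [0, inf) into [0, 1)."""
    return linear_hdr / (1.0 + linear_hdr)

# Scene-referred linear values spanning about 14 stops (1.0 = middle grey).
scene = 2.0 ** np.arange(-7, 8)

tone_mapped = reinhard(scene)                 # everything now fits in 0..1
display = np.round(tone_mapped * 255).astype(np.uint8)
print(display)  # the brightest stops get squeezed together just below 255
```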
In order to see high dynamic range images WITHOUT tone mapping, we need HDR monitors or TVs. Otherwise, the typical color profile most televisions run is something akin to Rec. 709... contrasty, like a JPEG. To the untrained eye this might seem more appealing, because non-Rec. 709 content that uses a linear or log profile looks flat by comparison. And truly, log, or logarithmic color, in all its many forms (D-Log, C-Log, S-Log, etc.) is called a flat profile for a reason. Everything is intentionally weighted towards your gamma, or middle grey, so when you actually store that information it has more room in the blacks and the whites to breathe. You're not actually increasing the size of the room; you're just squishing everything towards the center of the room so you don't hit a wall on either side. The effect of an image being "blown out" is essentially just that: the white values clipping after hitting the end of their container, with nowhere to go, just kind of smashing their faces against that wall.
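Here's a sketch of the idea behind a log curve, using a made-up simplified formula rather than the real math of D-Log, C-Log, or S-Log: code values are spent per stop of exposure instead of per unit of linear light, so the shadows get far more room and white lands exactly at the top of the container instead of beyond it.

```python
import numpy as np

def log_encode(linear, stops=14):
    # Map linear values from 2**-stops .. 1.0 onto 0 .. 1 logarithmically,
    # giving every stop of exposure an equal share of the code values.
    linear = np.clip(linear, 2.0 ** -stops, 1.0)
    return (np.log2(linear) + stops) / stops

print(log_encode(2.0 ** -7))  # 0.5: the darkest 7 of 14 stops get HALF the range
print(log_encode(1.0))        # 1.0: white sits at the wall instead of past it
```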
This is why it's kind of useless to shoot in a flat profile if you're only shooting 8-bit. With such limited space to work with, it doesn't make sense not to spread all of your information out and fill the entire container. Moving to 10-bit increases the space you have to work with, and adding log on top helps ensure nothing clips against the walls of that container, which is why 10-bit log is such a popular format. ProRes video runs from 10-bit up to 12-bit, and 14-bit RAW video may straight up be an image sequence of CinemaDNG files, basically a bunch of raw stills, one per frame, plus an audio file, though we now have actual lossless or semi-lossless video formats for that. So if you ever wonder why a screenshot of a video frame sometimes looks "crappier" than a still you shot yourself, bit depth might be the reason, though the quality of RAW video is effectively the same as raw stills... 10-bit log, RAW video, ProRes: these are all advancements in video formats trying to achieve the same level of quality we can get from a stills camera.
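As a rough illustration of that first point, why flat 8-bit is a waste while 10-bit log holds up (reusing the made-up log curve from above): count how many distinct code values each container leaves for the darkest four stops of the scene.

```python
import numpy as np

def log_encode(linear, stops=14):
    linear = np.clip(linear, 2.0 ** -stops, 1.0)
    return (np.log2(linear) + stops) / stops

# Densely sample the bottom 4 stops of a 14-stop scene.
shadows = np.linspace(2.0 ** -14, 2.0 ** -10, 100_000)

for bits in (8, 10):
    codes = np.round(log_encode(shadows) * (2 ** bits - 1))
    print(f"{bits}-bit log: {len(np.unique(codes))} shadow code values")

# 8-bit log:  74 code values for the entire bottom 4 stops
# 10-bit log: 293 code values, i.e. roughly 4x more room before banding
```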