Nathan S. answered 04/21/25
Professional Data Scientist and Machine Learning Engineer
Your results are dominated by the temperature measurements: individuals with similar temperatures are clustered together to a relatively high degree, while individuals who are similar on other measurements are not. This happens because, by adding both forehead and oral temperature, you essentially included the same concept, body temperature, twice. That is bad for two reasons. First, it over-weights body temperature as a concept. Most distance-based clustering methods group points by a distance computed across all features, so a concept that appears twice contributes twice to that distance (remember how the clusters put people with similar temperatures together but didn't necessarily do the same for other types of symptoms?). Second, it muddies the story the cluster model tells. Clustering is useful for its ability to segment data (patients, customers, etc.) into intelligible and meaningful groups, so when one concept is over-weighted, it becomes harder for your audience to read the segments in the underlying data.
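To make the over-weighting concrete, here is a minimal sketch using synthetic data. Everything in it is an assumption for illustration: the patient values are randomly generated, the extra features (heart_rate, cough_score) are hypothetical, and it assumes the features are standardized and compared with squared Euclidean distance, as a method like k-means would do.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic, hypothetical patients: one underlying body temperature
# measured two ways, plus two unrelated symptoms.
body_temp = rng.normal(37.0, 0.6, n)
forehead_temp = body_temp + rng.normal(0.0, 0.1, n)
oral_temp = body_temp + rng.normal(0.0, 0.1, n)
heart_rate = rng.normal(75.0, 10.0, n)
cough_score = rng.normal(3.0, 1.0, n)


def temperature_share(columns, temp_count):
    """Fraction of the mean squared pairwise distance that comes from
    the first `temp_count` columns (the temperature features)."""
    X = np.column_stack(columns)
    # Standardize so every feature has mean 0 and standard deviation 1.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Squared difference per feature for every pair of patients: (n, n, k).
    sq_diffs = (Z[:, None, :] - Z[None, :, :]) ** 2
    per_feature = sq_diffs.mean(axis=(0, 1))
    return per_feature[:temp_count].sum() / per_feature.sum()


share_both = temperature_share(
    [forehead_temp, oral_temp, heart_rate, cough_score], temp_count=2
)
share_one = temperature_share(
    [forehead_temp, heart_rate, cough_score], temp_count=1
)

print(f"temperature share with both readings: {share_both:.2f}")
print(f"temperature share with one reading:   {share_one:.2f}")
```

After standardization each feature contributes equally to the average squared distance, so body temperature accounts for two of four features (half of the distance) when both readings are included, but only one of three when a single reading is kept.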
Note that this assumes that both temperature readings are highly correlated and essentially representative of the same underlying concept. If there were some meaningful information to be gleaned from the difference between the two measurements, then you might want to include both features.
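One rough way to check that assumption before clustering is to look at the correlation between the two readings. The sketch below is only illustrative: the helper name merge_or_keep, the 0.9 threshold, and the sample values are all hypothetical choices, not a standard rule.

```python
import numpy as np


def merge_or_keep(forehead, oral, corr_threshold=0.9):
    """Average the two readings into one feature if they are highly
    correlated; otherwise keep both. The threshold is an arbitrary,
    hypothetical cutoff chosen for illustration."""
    forehead = np.asarray(forehead, dtype=float)
    oral = np.asarray(oral, dtype=float)
    # Pearson correlation between the two temperature readings.
    corr = np.corrcoef(forehead, oral)[0, 1]
    if corr >= corr_threshold:
        return {"body_temp": (forehead + oral) / 2.0}, corr
    return {"forehead_temp": forehead, "oral_temp": oral}, corr


# Made-up readings that track each other closely.
forehead = [36.5, 37.0, 38.2, 39.0, 36.8]
oral = [36.7, 37.1, 38.5, 39.2, 36.9]
features, corr = merge_or_keep(forehead, oral)
print(sorted(features), round(corr, 3))
```

A high correlation only suggests the two readings are redundant. Whether the difference between them is clinically meaningful is a domain question that the correlation alone cannot answer.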