Cloud top heights retrieved from Geostationary Operational Environmental Satellite (GOES) data are evaluated against 5 years of surface-based cloud radar and lidar data collected at the Atmospheric Radiation Measurement (ARM) program’s site near Lamont, Oklahoma. Two approaches are tested: separate daytime and nighttime algorithms developed at NASA Langley Research Center (LaRC) and applied to GOES imager data, and an operational CO2-slicing technique applied to GOES sounder data. Comparisons of the daytime, nighttime, and CO2-slicing cloud top heights with the surface retrievals yield mean differences of -0.84 ± 1.48 km, -0.56 ± 1.31 km, and -1.30 ± 2.30 km, respectively, for all clouds. The errors generally increase with increasing cloud altitude and decreasing optical thickness. These results, which highlight some of the challenges associated with passive satellite cloud height retrievals, are being used to guide development of a blended LaRC/CO2-slicing cloud top height product with accuracy suitable for assimilation into weather forecast models.
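
A mean difference reported in the form "bias ± spread" can be computed from matched satellite and surface cloud top heights roughly as in the following minimal Python sketch. It rests on two assumptions that the abstract does not state: that each difference is taken as satellite minus surface, and that the ± term is a sample standard deviation. The function name `difference_stats` and the input values are hypothetical, chosen for illustration only, and are not data from the study.

```python
import math


def difference_stats(satellite_km, surface_km):
    """Return (mean, sample standard deviation) of satellite minus surface heights.

    Assumption: differences are satellite minus surface, and the spread is the
    sample standard deviation (n - 1 in the denominator).
    """
    if len(satellite_km) != len(surface_km):
        raise ValueError("inputs must have the same length")
    diffs = [sat - sfc for sat, sfc in zip(satellite_km, surface_km)]
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two matched pairs")
    mean = sum(diffs) / n
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean, math.sqrt(variance)


# Made-up matched cloud top heights in km (illustrative only, not study data).
satellite = [9.0, 4.5, 11.0, 2.0]
surface = [10.0, 5.0, 13.0, 2.5]

bias, spread = difference_stats(satellite, surface)
print(f"{bias:.2f} ± {spread:.2f} km")
```

Under these assumptions, a negative bias means the satellite retrieval places the cloud top lower than the surface-based radar and lidar do.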