A one-liner for deduplicating Chinese characters

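# Unique characters in train.txt whose code point falls in U+4E00..U+9FA5 (common CJK ideographs)
[x for x in set(open("train.txt").read()) if 0x4E00 <= ord(x) <= 0x9FA5]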

Pretty slick... While preparing a corpus for ASR training, I spent ages working out how to extract the character dictionary.
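For anything beyond a quick experiment, the same idea can be spelled out a little more carefully. The sketch below is only one way to do it: the function name `extract_char_vocab` is made up for illustration, and it assumes the corpus is UTF-8 encoded (the one-liner relies on the platform default encoding instead). It closes the file explicitly and sorts the result so the dictionary comes out in the same order on every run. Note that it keeps the original U+4E00..U+9FA5 range; the CJK Unified Ideographs block itself runs to U+9FFF, so widen the bound if your corpus contains the later additions.

```python
import os
import tempfile

# Same range as the one-liner: U+4E00..U+9FA5.
CJK_FIRST, CJK_LAST = 0x4E00, 0x9FA5


def extract_char_vocab(path, encoding="utf-8"):
    """Return the sorted unique CJK ideographs found in the file at `path`.

    `encoding` defaults to UTF-8, which is an assumption about the corpus.
    """
    with open(path, encoding=encoding) as f:
        text = f.read()
    # set() removes duplicates; sorted() orders by code point for a stable result.
    return sorted(ch for ch in set(text) if CJK_FIRST <= ord(ch) <= CJK_LAST)


if __name__ == "__main__":
    # Tiny stand-in for train.txt so the example runs on its own.
    with tempfile.TemporaryDirectory() as tmp:
        corpus = os.path.join(tmp, "train.txt")
        with open(corpus, "w", encoding="utf-8") as f:
            f.write("你好,世界!你好 ASR 123\n")
        print(extract_char_vocab(corpus))  # ['世', '你', '好', '界']
```

Punctuation, Latin letters, digits, and whitespace all fall outside the range, so only the four distinct ideographs survive.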