https://github.com/onesuper/HuggingFace-Datasets-Text-Quality-Analysis/blob/92f66886bf96824ebbe59f55e037413069f8429c/app.py#L486

After `lsh.query`, and before the results are added to `unique_documents`, should the code call `results.remove(str(i))`?
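To illustrate the concern, here is a minimal sketch. It is not the code from `app.py`. `ToyIndex` and `find_unique` are hypothetical names, and the index uses exact Jaccard similarity as a stand-in for a real MinHash LSH index. It assumes every document is inserted under the key `str(i)` before it is queried, so that each query also returns the document's own key.

```python
from typing import Dict, List, Set


class ToyIndex:
    """Hypothetical stand-in for an LSH index (exact Jaccard, no hashing)."""

    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.items: Dict[str, Set[str]] = {}

    def insert(self, key: str, tokens: Set[str]) -> None:
        self.items[key] = tokens

    def query(self, tokens: Set[str]) -> List[str]:
        # Return every stored key whose Jaccard similarity meets the threshold.
        matches = []
        for key, other in self.items.items():
            union = tokens | other
            similarity = len(tokens & other) / len(union) if union else 1.0
            if similarity >= self.threshold:
                matches.append(key)
        return matches


def find_unique(docs: List[str], drop_self: bool, threshold: float = 0.8) -> List[int]:
    """Return indices of documents that have no near-duplicate in the index."""
    index = ToyIndex(threshold)
    token_sets = [set(doc.split()) for doc in docs]
    for i, tokens in enumerate(token_sets):
        index.insert(str(i), tokens)

    unique = []
    for i, tokens in enumerate(token_sets):
        results = index.query(tokens)
        if drop_self:
            # The document always matches itself, so drop its own key first.
            results.remove(str(i))
        if not results:
            unique.append(i)
    return unique


docs = ["a b c d", "a b c d", "x y z"]
print(find_unique(docs, drop_self=True))
print(find_unique(docs, drop_self=False))
```

In this sketch, leaving the self-match in place means no document is ever reported as unique, because every query returns at least the document's own key. Whether the same applies at the linked line depends on whether the documents are inserted into the index before they are queried.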