I'm seeing issues with node deduplication. When ingesting transcripts of informal conversations, I get duplicate nodes for what is clearly the same person, e.g., "John Doe" and "John T. Doe". Is there a way to have more control over this, or even a post-training capability to collapse these nodes into a single one?
I posted the same question on Discord yesterday. I'm testing LightRAG with scientific papers, which very often use abbreviated names. The same happens with entities that are rephrased.
@rabner - I guess I could try to preprocess the raw text and normalize the names before training, but being able to have some control over the deduping seems like a basic capability that is still missing.
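For anyone who wants to try the preprocessing route in the meantime, here is a minimal sketch of name normalization on the raw text. It is not part of LightRAG and uses only the Python standard library. The helper names (`canonical_key`, `build_alias_map`, `normalize_text`) are hypothetical, and it assumes you already have a list of person names, for example from a separate NER pass. It only handles the middle-initial case ("John Doe" vs. "John T. Doe"), so rephrased or abbreviated entities would need extra rules.

```python
import re
from collections import defaultdict

# A single letter, optionally followed by a period, e.g. "T." or "T".
_INITIAL = re.compile(r"^[A-Za-z]\.?$")


def canonical_key(name: str) -> str:
    """Lowercase the name and drop single-letter middle initials."""
    parts = name.split()
    # Always keep the first and last tokens; drop initials only in between.
    kept = [
        part
        for i, part in enumerate(parts)
        if i in (0, len(parts) - 1) or not _INITIAL.match(part)
    ]
    return " ".join(kept).lower()


def build_alias_map(names: list[str]) -> dict[str, str]:
    """Map every variant to one spelling (the shortest in its group)."""
    groups: dict[str, list[str]] = defaultdict(list)
    for name in names:
        groups[canonical_key(name)].append(name)

    alias_map: dict[str, str] = {}
    for variants in groups.values():
        canonical = min(variants, key=len)
        for variant in variants:
            alias_map[variant] = canonical
    return alias_map


def normalize_text(text: str, alias_map: dict[str, str]) -> str:
    """Replace each known variant in the text with its canonical spelling."""
    # Process longer variants first so a short name never clobbers part of
    # a longer one before the longer one has been handled.
    for variant in sorted(alias_map, key=len, reverse=True):
        canonical = alias_map[variant]
        if variant != canonical:
            text = text.replace(variant, canonical)
    return text


# Hypothetical usage: normalize a transcript before ingesting it.
aliases = build_alias_map(["John Doe", "John T. Doe"])
cleaned = normalize_text("John T. Doe spoke first. John Doe agreed.", aliases)
```

This is deliberately conservative: two names are merged only when they are identical after dropping middle initials, so it will not collapse distinct people who merely share a surname. It also cannot fix duplicates that are already in the graph, which is why built-in control over deduplication would still be useful.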